Incremental Learning of Retrievable Skills for Efficient Continual Task Adaptation
Introduction to Home Robots
Training: Models trained on multimodal datasets (vision, language, and actions).
Challenges: Adapting to environmental changes (e.g., new furniture, novel objects).
Proposed Solution: Tackle adaptation problems with incremental learning.
Preliminaries
Parameter Efficient Tuning:
Fine-tunes large pre-trained models with minimal additional parameters.
Key Method: LoRA (Low-Rank Adaptation).
LoRA's Functionality:
Adjusts frozen model weights by adding the product of two small low-rank matrices, sharply reducing trainable parameter count and memory usage.
Lowers training cost by enabling faster fine-tuning.
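The low-rank idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the layer size, rank, and function names are assumptions chosen for clarity.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Forward pass with a LoRA update: y = x @ (W + alpha * A @ B).

    W is the frozen pretrained weight (d x d); A (d x r) and B (r x d)
    are the two small trainable matrices, with rank r << d, so only
    A and B are updated during fine-tuning.
    """
    return x @ W + alpha * (x @ A) @ B

# Parameter comparison for an illustrative 512x512 layer with rank 8
d, r = 512, 8
full_params = d * d          # 262,144 trainable parameters (full fine-tuning)
lora_params = d * r + r * d  # 8,192 trainable parameters (~3% of full)
```

With A (or B) initialized to zero, the adapted layer starts out identical to the pretrained one, which is why LoRA training is stable from the first step.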
Continual Learning vs. Traditional Machine Learning
Traditional ML: Models trained once on a static dataset.
Continual Learning:
Models trained incrementally as new data arrives.
Adaptation to new tasks or knowledge over time.
Focus: Mitigating catastrophic forgetting while sharing knowledge over time.
Imitation Learning
Definition: Learning through mimicking expert actions.
Process: Tasks demonstrated by humans or provided via code.
Deployment: Trained autonomous policy controls task execution.
Benefit: Effective where reward function design for reinforcement learning is complex.
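In its simplest form, imitation learning reduces to supervised regression from expert states to expert actions (behavioral cloning). The sketch below uses a linear policy solved in closed form purely for illustration; real robot policies are neural networks, and all names here are hypothetical.

```python
import numpy as np

# Hypothetical linear expert: action = state @ expert_W
rng = np.random.default_rng(0)
expert_W = np.array([[2.0, 0.0],
                     [0.0, -1.0]])
states = rng.normal(size=(100, 2))   # states visited by the expert
actions = states @ expert_W          # corresponding expert actions

# Behavioral cloning: fit the policy by least squares on (state, action)
# pairs; with noiseless demonstrations it recovers the expert exactly.
policy_W, *_ = np.linalg.lstsq(states, actions, rcond=None)
```

No reward function is needed anywhere in this loop, which is the benefit noted above.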
Continual Imitation Learning (CIL)
Combination of Paradigms: Combines continual learning and imitation learning.
Data Stream: Expert demonstrations as the learning basis.
Evaluation: Tasks assessed through evolving criteria reflecting current requirements.
Adaptation Goal: Build versatile, adaptive robotic agents.
Challenges in CIL
Comprehensive Expert Demonstrations: Difficulty and inefficiency in gathering complete demonstrations.
Task Shifts in Dynamic Environments: Constantly changing tasks lead to adaptation difficulties.
Privacy Concerns: Knowledge accumulation may retain sensitive data inadvertently.
Proposed Framework: SCL (Skill-Centric Learning)
Process Overview:
Stores expert knowledge as paired skill prototypes and adapters learned from demonstrations.
Retrieves relevant skills based on state similarity during task evaluation.
Skill Retrieval: Ensures efficient task adaptation and completion through previously learned skills.
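The retrieval step can be sketched as nearest-prototype lookup: compare the current state to the stored skill prototypes and select the adapter paired with the best match. This is a minimal sketch assuming cosine similarity; the framework's actual similarity measure and data structures may differ.

```python
import numpy as np

def retrieve_skill(state, prototypes):
    """Return the index of the skill prototype most similar to the
    current state (cosine similarity). The returned index would select
    the paired adapter; shapes and names here are illustrative."""
    s = state / np.linalg.norm(state)
    P = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return int(np.argmax(P @ s))

# Three stored prototypes; the state is closest to prototype 0
prototypes = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [1.0, 1.0]])
state = np.array([0.9, 0.1])
skill_id = retrieve_skill(state, prototypes)  # -> 0
```

Because only the matched adapter is activated, previously learned skills are reused without retraining the whole policy.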
Evaluation Scenarios
Simulation Environments:
Uses the Franka Kitchen and Meta-World simulation environments.
Various task scenarios: complete, incomplete, and semi-complete.
Goal Conditioned Success Rate (GC):
Measures success in sequential task execution.
Reflects performance based on the successful completion of interdependent goals.
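One plausible way to compute such a metric, assuming credit is given only for the consecutive goals achieved from the start of each sequence (since later goals depend on earlier ones); the paper's exact formula may differ.

```python
def gc_success_rate(goal_sequences):
    """Fraction of goals achieved across episodes, counting only the
    unbroken prefix of successes in each sequence of interdependent
    goals. Input: list of per-goal booleans per episode."""
    total = sum(len(seq) for seq in goal_sequences)
    achieved = 0
    for seq in goal_sequences:
        for ok in seq:
            if not ok:
                break  # a failed goal blocks all later goals
            achieved += 1
    return achieved / total

# Episode 2 fails its middle goal, so its final goal cannot count
rate = gc_success_rate([[True, True, True], [True, False, True]])
```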
Metrics for Continual Imitation Learning
Forward Transfer (FWT): Evaluates learning new tasks from prior knowledge.
Backward Transfer (BWT): Measures how new tasks impact previously learned tasks (indicates catastrophic forgetting if negative).
Area Under Curve (AUC): Overall performance assessment across tasks and stages.
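The transfer metrics above are commonly computed from an evaluation matrix R, where R[i, j] is performance on task j after training through stage i. The sketch below uses the standard continual-learning definitions; the paper's exact formulas and baselines may differ.

```python
import numpy as np

def bwt(R):
    """Backward transfer: average change on earlier tasks after the
    final stage. Negative values indicate catastrophic forgetting."""
    T = R.shape[0]
    return float(np.mean([R[T - 1, j] - R[j, j] for j in range(T - 1)]))

def fwt(R, baseline):
    """Forward transfer: performance on task j before it is trained,
    relative to a from-scratch baseline b[j]."""
    T = R.shape[0]
    return float(np.mean([R[j - 1, j] - baseline[j] for j in range(1, T)]))

# Illustrative 3-task run: diagonal = performance right after learning
R = np.array([[0.8, 0.1, 0.0],
              [0.7, 0.9, 0.2],
              [0.6, 0.8, 0.9]])
baseline = np.zeros(3)
# bwt(R) is negative here (tasks 1 and 2 degrade after later stages),
# while fwt(R, baseline) is positive (prior stages help unseen tasks)
```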
Experimental Results
Comparison to Conventional Methods: SCL outperformed traditional approaches in AUC across scenarios, especially with unseen tasks.
Privacy: Remained robust in settings requiring tasks to be unlearned, and handled incomplete demonstrations well.
Conclusions
Key Contributions: SCL enhances flexibility and capability in continual imitation learning.
Future Directions:
Generalization: Combining model merging and task arithmetic for improved robustness.
Efficiency: Refining caching algorithms to speed up skill retrieval.
Closing Remarks
Summary: Advancements in continual imitation learning, emphasizing adaptability, robustness, and efficiency.
Thank You: Appreciation for audience attention.