Incremental Learning of Retrievable Skills for Efficient Continual Task Adaptation
Introduction to Home Robots
Training: Models trained on multimodal datasets (vision, language, and actions).
Challenges: Adapting to environmental changes (e.g., new furniture, novel objects).
Proposed Solution: Tackle adaptation problems with incremental learning.
Preliminaries
Parameter Efficient Tuning:
Fine-tunes large pre-trained models with minimal additional parameters.
Key Method: LoRA (Low-Rank Adaptation).
LoRA's Functionality:
Adjusts frozen model weights by adding the product of two small low-rank matrices, sharply reducing trainable parameter count and memory usage.
Lowers training cost by enabling faster fine-tuning.
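The low-rank idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the layer size, rank, and function names are assumptions chosen for clarity.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Forward pass with a LoRA update: y = x @ (W + alpha * A @ B).

    W is the frozen pretrained weight (d x d); A (d x r) and B (r x d)
    are the two small trainable matrices, with rank r << d, so only
    A and B are updated during fine-tuning.
    """
    return x @ W + alpha * (x @ A) @ B

# Parameter comparison for an illustrative 512x512 layer with rank 8
d, r = 512, 8
full_params = d * d          # 262,144 trainable parameters (full fine-tuning)
lora_params = d * r + r * d  # 8,192 trainable parameters (~3% of full)
```

With A (or B) initialized to zero, the adapted layer starts out identical to the pretrained one, which is why LoRA training is stable from the first step.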
Continual Learning vs. Traditional Machine Learning
Traditional ML: Models trained once on a static dataset.
Continual Learning:
Models trained incrementally as new data arrives.
Adaptation to new tasks or knowledge over time.
Focus: Mitigating catastrophic forgetting while sharing knowledge over time.
Imitation Learning
Definition: Learning through mimicking expert actions.
Process: Tasks demonstrated by humans or provided via code.
Deployment: Trained autonomous policy controls task execution.
Benefit: Effective where reward function design for reinforcement learning is complex.
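In its simplest form, imitation learning reduces to supervised regression from expert states to expert actions (behavioral cloning). The sketch below uses a linear policy solved in closed form purely for illustration; real robot policies are neural networks, and all names here are hypothetical.

```python
import numpy as np

# Hypothetical linear expert: action = state @ expert_W
rng = np.random.default_rng(0)
expert_W = np.array([[2.0, 0.0],
                     [0.0, -1.0]])
states = rng.normal(size=(100, 2))   # states visited by the expert
actions = states @ expert_W          # corresponding expert actions

# Behavioral cloning: fit the policy by least squares on (state, action)
# pairs; with noiseless demonstrations it recovers the expert exactly.
policy_W, *_ = np.linalg.lstsq(states, actions, rcond=None)
```

No reward function is needed anywhere in this loop, which is the benefit noted above.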
Continual Imitation Learning (CIL)
Combination of Paradigms: Combines continual learning and imitation learning.
Data Stream: Expert demonstrations as the learning basis.
Evaluation: Tasks assessed through evolving criteria reflecting current requirements.
Adaptation Goal: Build versatile, adaptive robotic agents.
Challenges in CIL
Comprehensive Expert Demonstrations: Difficulty and inefficiency in gathering complete demonstrations.
Task Shifts in Dynamic Environments: Constantly changing tasks lead to adaptation difficulties.
Privacy Concerns: Knowledge accumulation may retain sensitive data inadvertently.
Proposed Framework: SCL (Skill-Centric Learning)
Process Overview:
Stores expert knowledge as paired skill prototypes and adapters learned from demonstrations.
Retrieves relevant skills based on state similarity during task evaluation.
Skill Retrieval: Ensures efficient task adaptation and completion through previously learned skills.
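The retrieval step can be sketched as nearest-prototype lookup: compare the current state to the stored skill prototypes and select the adapter paired with the best match. This is a minimal sketch assuming cosine similarity; the framework's actual similarity measure and data structures may differ.

```python
import numpy as np

def retrieve_skill(state, prototypes):
    """Return the index of the skill prototype most similar to the
    current state (cosine similarity). The returned index would select
    the paired adapter; shapes and names here are illustrative."""
    s = state / np.linalg.norm(state)
    P = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return int(np.argmax(P @ s))

# Three stored prototypes; the state is closest to prototype 0
prototypes = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [1.0, 1.0]])
state = np.array([0.9, 0.1])
skill_id = retrieve_skill(state, prototypes)  # -> 0
```

Because only the matched adapter is activated, previously learned skills are reused without retraining the whole policy.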
Evaluation Scenarios
Simulation Environments:
Uses the Franka Kitchen and Meta-World simulation environments.
Various task scenarios: complete, incomplete, and semi-complete.
Goal Conditioned Success Rate (GC):
Measures success in sequential task execution.
Reflects performance based on the successful completion of interdependent goals.
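One plausible way to compute such a metric, assuming credit is given only for the consecutive goals achieved from the start of each sequence (since later goals depend on earlier ones); the paper's exact formula may differ.

```python
def gc_success_rate(goal_sequences):
    """Fraction of goals achieved across episodes, counting only the
    unbroken prefix of successes in each sequence of interdependent
    goals. Input: list of per-goal booleans per episode."""
    total = sum(len(seq) for seq in goal_sequences)
    achieved = 0
    for seq in goal_sequences:
        for ok in seq:
            if not ok:
                break  # a failed goal blocks all later goals
            achieved += 1
    return achieved / total

# Episode 2 fails its middle goal, so its final goal cannot count
rate = gc_success_rate([[True, True, True], [True, False, True]])
```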
Metrics for Continual Imitation Learning
Forward Transfer (FWT): Evaluates learning new tasks from prior knowledge.
Backward Transfer (BWT): Measures how new tasks impact previously learned tasks (indicates catastrophic forgetting if negative).
Area Under Curve (AUC): Overall performance assessment across tasks and stages.
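The transfer metrics above are commonly computed from an evaluation matrix R, where R[i, j] is performance on task j after training through stage i. The sketch below uses the standard continual-learning definitions; the paper's exact formulas and baselines may differ.

```python
import numpy as np

def bwt(R):
    """Backward transfer: average change on earlier tasks after the
    final stage. Negative values indicate catastrophic forgetting."""
    T = R.shape[0]
    return float(np.mean([R[T - 1, j] - R[j, j] for j in range(T - 1)]))

def fwt(R, baseline):
    """Forward transfer: performance on task j before it is trained,
    relative to a from-scratch baseline b[j]."""
    T = R.shape[0]
    return float(np.mean([R[j - 1, j] - baseline[j] for j in range(1, T)]))

# Illustrative 3-task run: diagonal = performance right after learning
R = np.array([[0.8, 0.1, 0.0],
              [0.7, 0.9, 0.2],
              [0.6, 0.8, 0.9]])
baseline = np.zeros(3)
# bwt(R) is negative here (tasks 1 and 2 degrade after later stages),
# while fwt(R, baseline) is positive (prior stages help unseen tasks)
```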
Experimental Results
Comparison to Conventional Methods: SCL outperformed traditional approaches in AUC across scenarios, especially with unseen tasks.
Privacy: Remained robust in settings requiring tasks to be unlearned, and handled incomplete demonstrations well.
Conclusions
Key Contributions: SCL enhances flexibility and capability in continual imitation learning.
Future Directions:
Generalization: Combining model merging and task arithmetic for improved robustness.
Efficiency: Refining caching algorithms to speed up skill retrieval.
Closing Remarks
Summary: Advancements in continual imitation learning, emphasizing adaptability, robustness, and efficiency.
Thank You: Appreciation for audience attention.