Language Model
A probabilistic model of text that, given an input, assigns a probability to each word in its vocabulary.
Vocabulary Distribution
The distribution of probabilities assigned to words in the language model's vocabulary.
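As a sketch of how such a distribution arises, the snippet below applies a softmax to made-up logits; the vocabulary and scores are illustrative, not from any particular model.

```python
import numpy as np

# Made-up vocabulary and raw scores (logits) a model might
# produce for the next word after some input.
vocab = ["the", "cat", "sat", "mat", "ran"]
logits = np.array([2.0, 0.5, 1.0, -1.0, 0.3])

# Softmax turns logits into a probability distribution over the vocabulary.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")  # the probabilities sum to 1
```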
Decoding
The process by which language models generate text using the probability distributions of their vocabulary.
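A minimal sketch of greedy decoding with a toy stand-in for a real model; the random next_token_probs function is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "<eos>"]

def next_token_probs(context):
    # Stand-in for a real model: returns a fresh random distribution.
    logits = rng.normal(size=len(vocab))
    p = np.exp(logits - logits.max())
    return p / p.sum()

# Greedy decoding: repeatedly append the highest-probability token.
tokens = ["the"]
while tokens[-1] != "<eos>" and len(tokens) < 10:
    probs = next_token_probs(tokens)
    tokens.append(vocab[int(np.argmax(probs))])
print(" ".join(tokens))
```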
Encoder-Decoder Models
Models built on the Transformer architecture, where an encoder converts input text into vector representations and a decoder generates output text from them.
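A short sketch using the Hugging Face transformers library and the public t5-small checkpoint (an assumption; any encoder-decoder model would do):

```python
# Requires: pip install transformers torch sentencepiece
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The encoder turns the input into vectors; the decoder
# generates output text conditioned on those vectors.
inputs = tokenizer("translate English to German: The cat sat.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```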
Semantic Search
Using encoder embeddings to retrieve text that is semantically similar to an input query.
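A minimal sketch of the idea with hand-made embedding vectors (real systems would obtain these from an encoder model):

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Pretend document embeddings; the values are made up.
query = np.array([0.9, 0.1, 0.2])
docs = {
    "refund policy": np.array([0.8, 0.2, 0.1]),
    "shipping times": np.array([0.1, 0.9, 0.3]),
}

# Rank documents by similarity to the query embedding.
for name in sorted(docs, key=lambda n: -cosine(query, docs[n])):
    print(f"{cosine(query, docs[name]):.3f}  {name}")
```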
Prompting
A method to control language models by altering the input structure or providing instructions.
Training
The process of feeding text to a model so it learns to predict the next word, most commonly applied to large decoder models.
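A toy next-word-prediction training step in PyTorch; the tiny embedding-plus-linear "model" is an assumption standing in for a real decoder:

```python
import torch
import torch.nn.functional as F

vocab_size = 100
emb = torch.nn.Embedding(vocab_size, 16)   # toy stand-in for a decoder
head = torch.nn.Linear(16, vocab_size)
opt = torch.optim.SGD(list(emb.parameters()) + list(head.parameters()), lr=0.1)

# Next-word prediction: each position's target is the following token.
tokens = torch.randint(0, vocab_size, (1, 8))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

opt.zero_grad()
logits = head(emb(inputs))                 # shape (1, 7, vocab_size)
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
opt.step()
print(f"loss: {loss.item():.3f}")
```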
Soft Prompting
Adding learnable parameters to the prompt that are tuned during training while the model's own weights stay frozen.
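A sketch of the mechanism in PyTorch: trainable prompt vectors are prepended to the input embeddings while everything else stays frozen (the dimensions are made up):

```python
import torch

d_model, prompt_len = 16, 4

# Learnable "soft prompt" vectors; only these would be updated in training.
soft_prompt = torch.nn.Parameter(torch.randn(prompt_len, d_model))

def with_soft_prompt(input_embeddings):
    # Prepend the trainable prompt vectors to each sequence in the batch.
    batch = input_embeddings.shape[0]
    prompt = soft_prompt.unsqueeze(0).expand(batch, -1, -1)
    return torch.cat([prompt, input_embeddings], dim=1)

x = torch.randn(2, 10, d_model)     # pretend token embeddings
print(with_soft_prompt(x).shape)    # torch.Size([2, 14, 16])
```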
Fine Tuning
Training a model for a specific task by adjusting its parameters, which can be computationally expensive.
Decoding Techniques
Methods like Greedy Decoding, Non-Deterministic Decoding, Temperature modulation, Nucleus Sampling, and Beam Search used in text generation.
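A sketch of two of these techniques, temperature and nucleus (top-p) sampling, over made-up logits:

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, 0.1, -1.0])   # made-up scores

def sample(logits, temperature=1.0, top_p=1.0):
    # Temperature: below 1 sharpens the distribution, above 1 flattens it.
    scaled = logits / temperature
    p = np.exp(scaled - scaled.max())
    p /= p.sum()
    # Nucleus (top-p): keep the smallest set of tokens whose
    # cumulative probability reaches top_p, then renormalize.
    order = np.argsort(p)[::-1]
    cum_before = np.cumsum(p[order]) - p[order]
    keep = order[cum_before < top_p]
    masked = np.zeros_like(p)
    masked[keep] = p[keep]
    return rng.choice(len(p), p=masked / masked.sum())

print(sample(logits, temperature=0.7, top_p=0.9))
```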
Hallucination
Generated text that is non-factual or ungrounded, which can be reduced through methods like retrieval-augmentation.
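A sketch of the retrieval-augmentation idea; the retrieve function below is a hypothetical stand-in for a real semantic-search step.

```python
def retrieve(query, k=2):
    # Hypothetical retriever; a real system would use semantic search.
    scored = {
        "Returns are accepted within 30 days.": 0.91,
        "Shipping takes 3-5 business days.": 0.40,
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]

def build_rag_prompt(question):
    # Grounding the model in retrieved passages reduces hallucination,
    # since the answer can be checked against the quoted context.
    context = "\n".join(f"- {p}" for p in retrieve(question))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

print(build_rag_prompt("What is the return window?"))
```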
Grounded Text
Text output that is supported by a source document; groundedness can be measured through Natural Language Inference by models such as TRUE.
Multi-Modal Models
Models trained on multiple types of data, such as text and images (e.g., DALL-E), which can process or produce several modalities at once.
Language Agents
Language models used in sequential decision-making scenarios, extended to take actions and use tools rather than only generate text.
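A toy sketch of the act-observe loop; the llm stub and the calculator tool are assumptions standing in for a real model and real tools.

```python
TOOLS = {
    "calculator": lambda expr: str(eval(expr)),  # toy tool; unsafe outside a demo
}

def llm(prompt):
    # Stub: a real model would choose the action from the prompt.
    return "calculator: 2 + 2" if "2 + 2" in prompt else "final: done"

def run_agent(task, max_steps=3):
    observation = task
    for _ in range(max_steps):
        action = llm(observation)
        name, _, arg = action.partition(": ")
        if name == "final":
            return arg
        observation = TOOLS[name](arg)   # act, then observe the result
    return observation

print(run_agent("What is 2 + 2?"))
```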
OCI Generative AI Service
A managed service offering various language models for building AI applications, allowing fine-tuning and dedicated AI clusters.
Embedding Models
Models that create numerical representations of text to aid in understanding meanings, often multilingual.
Tokens
Units of text like words or parts of words used by language models, with the number of tokens per word varying based on text complexity.
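A quick way to see tokenization in action, assuming OpenAI's tiktoken library (other services use their own tokenizers, so counts differ):

```python
# Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for text in ["cat", "tokenization", "antidisestablishmentarianism"]:
    ids = enc.encode(text)
    print(f"{text!r} -> {len(ids)} token(s): {ids}")  # rarer words split into more tokens
```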
Generation Models
Models such as Command, Command Light, and Llama, used for text generation and instruction following, with parameters like Maximum Output Tokens and Temperature.
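A sketch of those parameters using GPT-2 via Hugging Face transformers as a stand-in (an assumption; max_new_tokens and temperature play the roles of Maximum Output Tokens and Temperature):

```python
# Requires: pip install transformers torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Write a one-line greeting:", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=30,                     # caps the output length
    temperature=0.7,                       # lower = more deterministic
    do_sample=True,                        # sample instead of greedy decoding
    pad_token_id=tokenizer.eos_token_id,   # silences a padding warning
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```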
Embeddings
Numerical representations of text that capture relationships between pieces of text; similarity between them can be computed with measures such as Cosine Similarity and Dot Product Similarity.
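A small sketch contrasting the two measures on made-up vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a, twice the magnitude

dot = a @ b
cos = dot / (np.linalg.norm(a) * np.linalg.norm(b))

# Dot product grows with vector magnitude; cosine ignores magnitude
# and measures direction only, so parallel vectors score 1.0.
print(f"dot: {dot:.1f}, cosine: {cos:.3f}")   # dot: 28.0, cosine: 1.000
```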
Human Feedback
Human judgments used to fine-tune models so that they follow instructions effectively.
F-Strings
Python's string-formatting syntax, used to create multiline prompts for language models.
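A minimal example of a multiline f-string prompt (the document and question are made up):

```python
document = "Returns are accepted within 30 days of purchase."
question = "How long is the return window?"

# The f-string interpolates variables directly into a multiline prompt.
prompt = f"""You are a helpful assistant.

Document:
{document}

Question: {question}
Answer:"""
print(prompt)
```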