Phi-3 mini model released under MIT

Phi-3 mini model released under MIT! 🚀 Last Week Llama 3, this week Phi-3 🤯 Microsoft Phi-3 comes in 3 different sizes: mini (3.8B), small (7B) & medium (14B). Phi-3-mini was released , claiming to match Llama 3 8B performance! 🚀

What is Phi-3 Mini?
- Phi-3 Mini is a lightweight language model with 3.8 billion parameters.
- It belongs to the Phi-3 family and comes in two variants: 4K and 128K context lengths.
Training and Fine-Tuning:
- Trained on a massive 3.3 trillion tokens.
- Fine-tuned using supervised fine-tuning (SFT) and direct preference optimization (DPO).
Performance Metrics:
- Achieves impressive performance:
  - 68.8% on MMLU (common sense, language understanding, mathematics, coding, etc.)
  - 8.38 on MT Bench (long-term context and logical reasoning).
- Outperforms models like Mistral 7B and Llama 3 8B Instruct.
Availability and Platforms:
- Available on Hugging Face and in Hugging Chat.
- Runs on Android and iPhones.
Use Cases:
- Ideal for memory/compute-constrained environments.
- Strong reasoning capabilities (especially for code, math, and logic).
Limitations:

No base model released.
English-only intent use (no multilingual support).
Lack of details on dataset mix and data/benchmark contamination

Demo: https://huggingface.co/chat/models/microsoft/Phi-3-mini-4k-instruct

Models: https://huggingface.co/models?other=phi3&sort=trending&search=microsoft

Reports: https://huggingface.co/papers/2404.14219

Tech GPT

Search This Blog

Phi-3 mini model released under MIT

Comments

Popular posts from this blog

Optimizing LLM Queries for CSV Files to Minimize Token Usage: A Beginner's Guide

Transforming Workflows with CrewAI: Harnessing the Power of Multi-Agent Collaboration for Smarter Automation

Cursor AI & Lovable Dev – Their Impact on Development