Tuesday, April 30, 2024

Phi-3 mini model released under MIT

Last week Llama 3, this week Phi-3 🤯 Microsoft's Phi-3 comes in three sizes: mini (3.8B), small (7B), and medium (14B). Phi-3-mini has now been released under the MIT license, claiming to match Llama 3 8B performance! 🚀

  1. What is Phi-3 Mini?

    • Phi-3 Mini is a lightweight language model with 3.8 billion parameters.
    • It belongs to the Phi-3 family and comes in two variants, with 4K and 128K context lengths.
  2. Training and Fine-Tuning:

    • Trained on a massive 3.3 trillion tokens.
    • Fine-tuned using supervised fine-tuning (SFT) and direct preference optimization (DPO).
  3. Performance Metrics:

    • Reported benchmark results:
      • 68.8% on MMLU (multiple-choice questions spanning 57 subjects, from STEM to the humanities).
      • 8.38 on MT-Bench (multi-turn conversation and instruction following, scored from 1 to 10).
    • Reported to outperform models like Mistral 7B and Llama 3 8B Instruct on these benchmarks.
  4. Availability and Platforms:

    • Available on Hugging Face and in Hugging Chat.
    • Can run on Android phones and iPhones.
  5. Use Cases:

    • Ideal for memory/compute-constrained environments.
    • Strong reasoning capabilities (especially for code, math, and logic).
  6. Limitations:

    • No base model released.
    • Intended for English use only (no multilingual support).
    • Lack of details on the dataset mix and on data/benchmark contamination.
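
To make the "memory-constrained" point a bit more concrete, here is a rough back-of-the-envelope sketch of how much storage the weights of a 3.8B-parameter model would need at different precisions. The list of precisions and the helper name `weight_memory_gb` are illustrative assumptions, not something from the Phi-3 release, and the estimate covers the weights only (no KV cache, activations, or runtime overhead).

```python
# Rough weight-storage estimate for a 3.8B-parameter model.
# Assumption: every parameter is stored at the same bit width.
PARAMS = 3.8e9  # parameter count quoted for Phi-3 mini

# Illustrative storage formats and their bits per weight (assumed, not from the release).
BITS_PER_WEIGHT = {"fp32": 32, "fp16/bf16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(params: float, bits: int) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name:>9}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions the weights come to roughly 7.6 GB at 16-bit and roughly 1.9 GB at 4-bit, which is why a quantized build is the plausible route to running on a phone.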




