deepseek-ai (DeepSeek) - Hugging Face
Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism. A unified multimodal understanding and generation model. Org profile for DeepSeek on Hugging Face, the AI community building the future.
deepseek-ai/DeepSeek-V3 · Hugging Face
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2.
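The sparsity implied by the abstract above can be made concrete with a quick back-of-the-envelope calculation (a sketch only; the variable names are illustrative and the figures are the ones quoted in the abstract):

```python
# Rough view of DeepSeek-V3's MoE sparsity, using the figures quoted
# above: 671B total parameters, 37B activated per token.

TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters activated per token, in billions

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"~{active_fraction:.1%} of parameters are active per token")
```

Only about 5.5% of the model's parameters participate in any single forward pass, which is how an MoE model of this scale keeps inference cost closer to that of a much smaller dense model.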
DeepSeek-V3 - a deepseek-ai Collection - Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
DeepSeek V3: $5.5M Trained Model Beats GPT-4o & Llama 3.1
Jan 29, 2025 · If you prefer not to use the chat UI and want to directly work with the model, there’s an alternative for you. The model, DeepSeek-V3, has all its weights released on Hugging Face. You can access the SafeTensor files there. Model Size and Hardware Requirements:
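The "Model Size and Hardware Requirements" question raised in that snippet can be estimated directly from the parameter count. The sketch below is an assumption-laden illustration (the helper name is made up, and the bytes-per-parameter figures assume FP8 vs. BF16 storage); the 685B total is the figure given in the DeepSeek-V3 README excerpt elsewhere on this page:

```python
# Hedged sketch: rough disk/memory footprint of the raw weights from the
# parameter count. Helper name and precision assumptions are illustrative.

def weights_size_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight size in GB, treating 1 GB as 1e9 bytes."""
    return params_billion * bytes_per_param

total_b = 685  # 671B main model + 14B MTP module, per the README
print(f"FP8  (1 byte/param):  ~{weights_size_gb(total_b, 1):.0f} GB")
print(f"BF16 (2 bytes/param): ~{weights_size_gb(total_b, 2):.0f} GB")
```

Even at one byte per parameter, the raw weights alone run to roughly 685 GB, which is why full-precision local deployment requires multi-GPU server hardware rather than a consumer machine.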
DeepSeek-V3 Capabilities
DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.
DeepSeek-V3: the model everyone is talking about
Jan 2, 2025 · Awesome exploration of scaling test-time compute with open models by Hugging Face. "Check out this plot where the tiny 1B and 3B Llama Instruct models outperform their much larger 8B and 70B siblings on the challenging MATH …
DeepSeek-V3, ultra-large open-source AI, outperforms
Dec 26, 2024 · Chinese AI startup DeepSeek, known for challenging leading AI vendors with its innovative open-source technologies, today released a new ultra-large model: DeepSeek-V3. Available via Hugging Face.
DeepSeek is preparing Deep Roles and released a new V3 model
Dec 26, 2024 · Discover DeepSeek v3, the fastest and most advanced open-source language model yet. Explore its new features and hidden gems like Deep Roles on Hugging Face.
How to Run DeepSeek Models Locally in 5 Minutes? - Analytics …
Jan 29, 2025 · This family of open-source models can be accessed through Hugging Face or Ollama, while DeepSeek-R1 and DeepSeek-V3 can be directly used for inference via DeepSeek Chat. In this blog, we’ll explore DeepSeek’s model lineup and guide you through running these models using Google Colab and Ollama.
DeepSeek-V3/README.md at main · deepseek-ai/DeepSeek-V3
The total size of DeepSeek-V3 models on Hugging Face is 685B, which includes 671B of the Main Model weights and 14B of the Multi-Token Prediction (MTP) Module weights.