deepseek-ai (DeepSeek) - Hugging Face
Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism. A unified multimodal understanding and generation model. Org profile for DeepSeek on Hugging Face, the AI community building the future.
deepseek-ai/DeepSeek-V3 · Hugging Face
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2.
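The sparsity implied by the abstract above can be made concrete with a quick back-of-the-envelope calculation (a sketch only; the variable names are illustrative and the figures are the ones quoted in the abstract):

```python
# Rough view of DeepSeek-V3's MoE sparsity, using the figures quoted
# above: 671B total parameters, 37B activated per token.

TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters activated per token, in billions

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"~{active_fraction:.1%} of parameters are active per token")
```

Only about 5.5% of the model's parameters participate in any single forward pass, which is how an MoE model of this scale keeps inference cost closer to that of a much smaller dense model.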
DeepSeek-V3 - a deepseek-ai Collection - Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
DeepSeek V3: $5.5M Trained Model Beats GPT-4o & Llama 3.1
Jan 29, 2025 · If you prefer not to use the chat UI and want to directly work with the model, there’s an alternative for you. The model, DeepSeek-V3, has all its weights released on Hugging Face. You can access the SafeTensor files there. Model Size and Hardware Requirements:
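The "Model Size and Hardware Requirements" question raised in that snippet can be estimated directly from the parameter count. The sketch below is an assumption-laden illustration (the helper name is made up, and the bytes-per-parameter figures assume FP8 vs. BF16 storage); the 685B total is the figure given in the DeepSeek-V3 README excerpt elsewhere on this page:

```python
# Hedged sketch: rough disk/memory footprint of the raw weights from the
# parameter count. Helper name and precision assumptions are illustrative.

def weights_size_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight size in GB, treating 1 GB as 1e9 bytes."""
    return params_billion * bytes_per_param

total_b = 685  # 671B main model + 14B MTP module, per the README
print(f"FP8  (1 byte/param):  ~{weights_size_gb(total_b, 1):.0f} GB")
print(f"BF16 (2 bytes/param): ~{weights_size_gb(total_b, 2):.0f} GB")
```

Even at one byte per parameter, the raw weights alone run to roughly 685 GB, which is why full-precision local deployment requires multi-GPU server hardware rather than a consumer machine.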
DeepSeek-V3 Capabilities
DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.
DeepSeek-V3: the model everyone is talking about
Jan 2, 2025 · Awesome exploration of scaling test-time compute with open models by Hugging Face. "Check out this plot where the tiny 1B and 3B Llama Instruct models outperform their much larger 8B and 70B siblings on the challenging MATH …
DeepSeek-V3, ultra-large open-source AI, outperforms
Dec 26, 2024 · Chinese AI startup DeepSeek, known for challenging leading AI vendors with its innovative open-source technologies, today released a new ultra-large model: DeepSeek-V3. Available via Hugging Face.
DeepSeek is preparing Deep Roles and released a new V3 model
Dec 26, 2024 · Discover DeepSeek v3, the fastest and most advanced open-source language model yet. Explore its new features and hidden gems like Deep Roles on Hugging Face.
How to Run DeepSeek Models Locally in 5 Minutes? - Analytics …
Jan 29, 2025 · This family of open-source models can be accessed through Hugging Face or Ollama, while DeepSeek-R1 and DeepSeek-V3 can be directly used for inference via DeepSeek Chat. In this blog, we’ll explore DeepSeek’s model lineup and guide you through running these models using Google Colab and Ollama.
DeepSeek-V3/README.md at main · deepseek-ai/DeepSeek-V3
The total size of DeepSeek-V3 models on Hugging Face is 685B, which includes 671B of the Main Model weights and 14B of the Multi-Token Prediction (MTP) Module weights.