DeepSeek Technology: Deep Learning and Application Innovation



#deepseek #llm #gpu #distillation


DeepSeek is an open-source model; releasing it openly helps the technology mature faster. Another key feature is its software-level performance optimization: it uses model distillation and a Mixture of Experts (MoE) architecture to lighten the model's computational load.

Model Distillation: This is a machine learning technique that transfers knowledge from a large, complex “Teacher Model” to a smaller, more efficient “Student Model.” The goal is to enable the student model to retain the teacher model’s performance while reducing computational costs and resource demands.
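The idea above can be sketched in a few lines of plain Python. This is a minimal, illustrative version of the classic soft-label distillation objective (temperature-scaled softmax plus KL divergence); the function names and toy logits are our own, and this is not DeepSeek's actual training code:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature softens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Minimizing this pushes the student's output distribution toward the
    teacher's, transferring the teacher's knowledge about how classes
    relate, not just which class is correct.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    # KL(p || q), scaled by T^2 as in the standard formulation
    return temperature ** 2 * sum(
        pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0
    )
```

In practice this term is combined with the ordinary cross-entropy loss on the true labels; a student that exactly matches the teacher's logits incurs zero distillation loss.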

MoE: This approach divides the model into multiple “Experts,” each specializing in specific tasks or data subsets. During inference, dynamic routing activates only a subset of experts, thereby reducing the computational load.
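A toy sketch of the top-k routing described above, again in plain Python. The experts and gate here are stand-in functions invented for illustration; real MoE layers operate on tensors and learn the gate, but the control flow — score all experts, run only the top k, mix their outputs — is the same:

```python
def top_k_route(gate_scores, k=2):
    """Return the indices of the k experts with the highest gating scores."""
    return sorted(range(len(gate_scores)),
                  key=lambda i: gate_scores[i], reverse=True)[:k]

def moe_forward(x, experts, gate, k=2):
    """Sparse MoE layer: only the top-k experts process the input.

    `experts` is a list of callables and `gate` maps the input to one
    score per expert. Unselected experts do no computation at all,
    which is where the savings over a dense model come from.
    """
    scores = gate(x)
    chosen = top_k_route(scores, k)
    # Normalize the selected scores into mixing weights
    total = sum(scores[i] for i in chosen)
    return sum(scores[i] / total * experts[i](x) for i in chosen)
```

With four experts and k=2, only half the experts run on any given input, so compute per token stays roughly constant even as the total parameter count grows.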

For more information about DeepSeek, let semiconductor expert Professor Lin Chia-wei and financial marketing expert Professor Ma Jui-chen explain it to you in the video!

【發財二極體 Platforms】
YouTube channel:
Facebook fan page:

【Contact Us】
email: service@my-galaxy.com.tw


Disclaimer
The content published on this page is sourced from external platforms, including YouTube. We do not own or claim any rights to the videos embedded here. All videos remain the property of their respective creators and are shared for informational and educational purposes only.

If you are the copyright owner of any video and wish to have it removed, please contact us, and we will take the necessary action promptly.
