DeepSeek: Everything About The AI Chatbot App


DeepSeek’s Origins In Trading

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Liang, who reportedly began experimenting with trading while a student at Zhejiang University, launched High-Flyer Capital Management in 2019 as a hedge fund focused on developing and deploying AI algorithms.

 

In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools separate from its financial business. The lab later spun off into its own company, also called DeepSeek, with High-Flyer as one of its investors.

From the start, DeepSeek built its own data center clusters for model training. Like other AI companies in China, however, it has been affected by U.S. export restrictions on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100, a chip that remains available to U.S. companies.

 

DeepSeek’s technical team is said to skew young. The company reportedly recruits doctorate-level AI researchers aggressively from top Chinese universities. According to The New York Times, DeepSeek also hires people without a computer science background to help its technology better understand a wide range of subjects.

 

DeepSeek’s Strong Models

In November 2023, DeepSeek released its first set of models: DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat. However, the AI industry didn’t start paying attention until the spring of 2024, when the startup unveiled its next-generation DeepSeek-V2 family of models.

DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well on a number of AI benchmarks and was far cheaper to run than comparable models at the time. It forced ByteDance and Alibaba, two of DeepSeek’s domestic rivals, to cut the usage prices for some of their models and make others completely free.

 

The release of DeepSeek-V3 in December 2024 only raised DeepSeek’s profile further.

According to DeepSeek’s internal benchmark testing, DeepSeek-V3 outperforms both downloadable, openly available models such as Meta’s Llama and “closed” models that can only be accessed through an API, such as OpenAI’s GPT-4o.

DeepSeek’s R1 “reasoning” model is equally impressive. Released in January, R1 performs as well as OpenAI’s o1 model on key benchmarks, according to DeepSeek.

 

Because it is a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that commonly trip up models. Reasoning models take a little longer to arrive at solutions than a typical non-reasoning model, usually seconds to minutes longer. The upside is that they tend to be more reliable in domains such as physics, science, and math.

 

However, R1, DeepSeek-V3, and DeepSeek’s other models have a drawback. Because they were developed in China, they are subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for instance, R1 won’t answer questions about Tiananmen Square or Taiwan’s autonomy.

 

DeepSeek received more than 16.5 million visits in March. “[F]or March, DeepSeek is in second place, despite seeing traffic drop 25% from where it was in February,” David Carr, editor at Similarweb, told TechCrunch. It still trails far behind ChatGPT, which surpassed 500 million weekly active users in March.

In May, DeepSeek made an updated version of its R1 reasoning model available on the Hugging Face developer platform.

 

A Disruptive Approach

If DeepSeek has a business model, it’s not clear exactly what that model is. The company offers some of its products and services for free and prices others well below market value. It is also not taking investor money, despite considerable VC interest.

According to DeepSeek, efficiency breakthroughs have allowed it to remain extremely cost-competitive. However, several experts dispute the figures the company has provided.

 

In any case, developers have embraced DeepSeek’s models, which are available under permissive licenses that allow commercial use but are not open source in the traditional sense of the term. According to Clem Delangue, the CEO of Hugging Face, one of the platforms that hosts DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1, which have together received 2.5 million downloads.

 

DeepSeek’s success against larger and more established rivals has been described both as “upending AI” and as “over-hyped.” That success was at least partly responsible for the 18% drop in Nvidia’s stock price in January and for drawing a public response from OpenAI CEO Sam Altman. According to Reuters, U.S. Commerce Department bureaus told staff in March that DeepSeek would be banned on their government devices.

 

Microsoft announced that DeepSeek is available on its Azure AI Foundry service, a platform that brings enterprise AI services together under one roof. Asked about DeepSeek’s effect on Meta’s AI spending during the company’s first-quarter earnings call, CEO Mark Zuckerberg said that investing in AI infrastructure will remain a “strategic advantage” for Meta. In March, OpenAI called DeepSeek “state-subsidized” and “state-controlled” and recommended that the U.S. government consider banning DeepSeek’s models.

 

On Nvidia’s fourth-quarter earnings call, CEO Jensen Huang highlighted DeepSeek’s “excellent innovation,” saying that it and other “reasoning” models are good for Nvidia because of their heavy computational needs.

Meanwhile, some companies are banning DeepSeek, as are entire countries and governments, including South Korea. New York State has also banned DeepSeek from government devices.

 

During a Senate hearing in May, Microsoft Vice Chairman and President Brad Smith said that Microsoft employees are not allowed to use DeepSeek because of data security and propaganda concerns.

It’s unclear what the future holds for DeepSeek. Improved models are all but certain. However, the U.S. government appears to be growing more wary of what it sees as harmful foreign influence. The Wall Street Journal reported in March that the U.S. will likely ban DeepSeek on government devices.