DeepSeek is a Mixture-of-Experts (MoE) language model that achieves performance comparable to other large language models, while using far fewer computational resources to train* and execute. It was developed in China in response to US export restrictions on high-end GPU chips, which precluded the brute-force techniques that had dominated the field.
* Note: there is some controversy as to whether DeepSeek was based on distillation from OpenAI models, which would mean its training effectively relies on large-scale and expensive processes. However, even if this turns out to be the case, it is still more computationally efficient during execution, and it has certainly spurred other big AI vendors to look at more efficient models.
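To illustrate why an MoE architecture reduces execution cost, the sketch below shows a generic top-k routed MoE layer. This is an illustrative example only, not DeepSeek's actual implementation; all layer sizes and names (MoELayer, num_experts, top_k) are assumptions chosen for the demo. The key point is that each token activates only top_k of num_experts expert networks, so most parameters stay idle for any given token.

```python
# Minimal sketch of top-k Mixture-of-Experts routing (illustrative only;
# not DeepSeek's actual implementation). Each token is processed by just
# top_k experts out of num_experts, so per-token compute stays small even
# though the total parameter count is large.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router over experts
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden),
                          nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.gate(x)                          # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1) # pick top_k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e               # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(MoELayer()(tokens).shape)                        # torch.Size([10, 64])
```

With top_k=2 and num_experts=8, only a quarter of the expert parameters are exercised per token, which is the source of the efficiency gain the article refers to.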