Inception, a Palo Alto-based startup founded by Stanford computer science professor Stefano Ermon, says it has developed a novel AI model based on “diffusion” technology. The company calls it a diffusion-based large language model, or “DLM” for short.

Generative AI today centers on two main model families: large language models (LLMs) and diffusion models. LLMs, built on the transformer architecture, are used primarily for text generation. Diffusion models, which power systems like Midjourney and OpenAI’s Sora, are mainly used to create images, video, and audio.

According to Inception, its model offers the capabilities of conventional LLMs, including code generation and question answering, but with significantly faster performance and lower computing costs.

Ermon told TechCrunch that he has long studied how to apply diffusion models to text in his Stanford lab. His research suggested that traditional LLMs are fundamentally limited in speed by their sequential design, a constraint that diffusion does not share.

He explained that LLMs generate words in a sequential manner: “You cannot produce the second word until the first is generated, and you can’t create the third until the first two exist.”
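
To make that dependency concrete, here is a minimal Python sketch of autoregressive decoding. The `next_token` function and the token values are hypothetical stand-ins for a trained model, not any particular system’s API:

```python
# Minimal toy of autoregressive (LLM-style) decoding. `next_token` is a
# hypothetical stand-in for a trained model: it predicts one token given
# everything generated so far.

def generate_autoregressive(prompt, next_token, max_new_tokens=50, eos="<eos>"):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # Each prediction depends on ALL tokens produced so far, so the
        # steps form a chain and cannot be parallelized.
        tok = next_token(tokens)
        tokens.append(tok)
        if tok == eos:
            break
    return tokens

# Trivial stand-in "model" for demonstration:
demo = lambda ts: "world" if len(ts) < 3 else "<eos>"
print(generate_autoregressive(["hello"], demo))  # ['hello', 'world', 'world', '<eos>']
```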

Ermon wanted to apply a diffusion approach to text because, unlike LLMs, which produce output sequentially, diffusion models start with a rough estimate of the data they are generating (like a blurry image) and then bring it into focus all at once rather than piece by piece.
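
For contrast, here is a conceptual Python toy of diffusion-style text generation. It sketches the general masked-denoising idea under assumed interfaces (`denoise` and `MASK` are invented for illustration); Inception has not published the details of its method, so this should not be read as the company’s algorithm:

```python
import random

# Conceptual toy of diffusion-style text generation: start from a fully
# masked ("noisy") sequence and refine every position in parallel at each
# step. `denoise` is a hypothetical stand-in for a learned model that
# returns a (token, confidence) guess for every position in one pass.

MASK = "<mask>"

def generate_diffusion(seq_len, denoise, num_steps=4):
    tokens = [MASK] * seq_len            # the "rough image": all noise
    for step in range(num_steps):
        guesses = denoise(tokens, step)  # one pass updates ALL positions
        # Commit the most confident guesses; leave the rest masked so
        # later passes can bring them into focus (coarse to fine).
        commit = seq_len * (step + 1) // num_steps
        ranked = sorted(range(seq_len), key=lambda i: -guesses[i][1])
        for i in ranked[:commit]:
            tokens[i] = guesses[i][0]
    return tokens

# Trivial stand-in "denoiser" for demonstration:
vocab = ["the", "cat", "sat", "down"]
demo = lambda toks, step: [(vocab[i % 4], random.random()) for i in range(len(toks))]
print(generate_diffusion(8, demo))
```

Each pass touches the whole sequence at once, which is what lets a diffusion sampler trade a long chain of per-token steps for a short, fixed number of parallel refinement steps.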

He hypothesized that diffusion models could generate and modify large blocks of text in parallel. After years of trying, Ermon and a student achieved a major breakthrough, which they detailed in a research paper published last year.

Recognizing the implications of this advancement, Ermon launched Inception last summer, enlisting the help of two former students—UCLA professor Aditya Grover and Cornell professor Volodymyr Kuleshov—to co-lead the venture.

While details regarding Inception’s funding remain private, sources indicate that the Mayfield Fund has made an investment.

The company has already signed several customers, including unnamed Fortune 100 companies, by addressing their acute need for lower AI latency and faster generation, according to Ermon.

“Our models can efficiently utilize GPUs,” Ermon noted, referencing the standard chips used for running AI models in production. “I believe this is a game changer. It’s set to transform how language models are developed.”

Inception provides an API along with options for on-premises and edge device deployments, model fine-tuning support, and a range of ready-to-use DLMs tailored to various applications. The company asserts that its DLMs can operate up to 10 times faster than traditional LLMs while incurring one-tenth of the costs.

A representative from the company stated, “Our ‘small’ coding model matches the performance of [OpenAI’s] GPT-4o mini while being over 10 times quicker.” They further claimed that their ‘mini’ model outperforms smaller open-source alternatives like [Meta’s] Llama 3.1 8B, achieving speeds exceeding 1,000 tokens per second.

In AI parlance, “tokens” are the small chunks of text that models read and write, and generation speed is commonly measured in tokens per second. If Inception’s claims hold up, 1,000 tokens per second would indeed be remarkable, well above the rates at which conventional autoregressive models commonly stream today.
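
As a back-of-envelope illustration of what that throughput would mean for a single response (the 100 tokens-per-second baseline below is an assumed round number for a conventional deployment, not a figure from the article):

```python
# Rough latency comparison. The 1,000 tokens/sec figure is Inception's
# claim from the article; the 100 tokens/sec baseline is an assumption
# chosen for illustration, not a measurement.

claimed_dlm_tps = 1_000   # claimed DLM throughput
assumed_llm_tps = 100     # assumed autoregressive baseline
response_tokens = 500     # a medium-length answer

print(f"DLM: {response_tokens / claimed_dlm_tps:.1f}s per response")  # 0.5s
print(f"LLM: {response_tokens / assumed_llm_tps:.1f}s per response")  # 5.0s
```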
