Artificial Intelligence - Page 36
AI news on generative models, ChatGPT, Gemini, OpenAI, Google DeepMind, Anthropic, xAI, NVIDIA AI hardware, and real-world breakthroughs. - Page 36
Microsoft confirms that you can't uninstall its controversial Windows Recall AI feature
Before launching its new Copilot+ PC range, Microsoft pulled one of its flagship Windows 11 AI features, 'Recall.' The search tool leveraged the Copilot+ NPU to search your PC history to 'recall' or find things you've looked at or worked on. An example could be asking Recall to find that funny YouTube video you watched three days ago.
The only problem was that Recall worked by constantly taking screenshots of your desktop without any security or filters. From there, the AI builds a searchable index with everything categorized without regard for privacy.
Once this became known, the controversial AI feature made headlines, and it continued to make headlines until Microsoft pulled it from the Copilot+ PC launch to retool its functionality and security features. Well, it's coming back, and you won't be able to uninstall Recall once it does.
World's most-powerful AI system has just been switched online
A few select companies in the technology sector are leading the exponentially growing push into artificial intelligence (AI) development, and Elon Musk has seemingly pushed ahead of everyone by switching on what he's calling the world's most powerful AI training system.
According to Musk, the team at xAI began working on the AI training system 122 days ago, and they were able to stand it up completely in that brief amount of time. The new system consists of 100,000 NVIDIA H100 GPUs and has been named "Colossus". Additionally, Musk revealed that Colossus isn't completely finished, as it's expected to gain another 100,000 NVIDIA GPUs over the "next few months".
Musk's claim that xAI has built the "most powerful" AI training system is based on the number of GPUs or training horsepower the system has available to it, which is estimated to be more than any other system publicly revealed. Musk's new AI training system will likely be used to train X's large language model, which powers the social media platform's AI chatbot, Grok.
CEO of OpenAI Japan just said an AI called 'GPT-Next' is coming, it's 100x better than GPT-4
OpenAI is close to releasing a new AI model, with OpenAI Japan teasing GPT-Next as 100x more powerful than GPT-4 without wasting more computing resources.
The AI model codenamed "Strawberry" was teased alongside "Orion", with Strawberry ending up as GPT-Next and expected by the end of the year, while Orion will be unleashed in 2025. OpenAI's new GPT-Next, aka Strawberry, has something researchers call "System 2 thinking".
What is System 2 thinking? It allows GPT-Next to take the time to deliberate and reason through problems, versus just predicting longer and longer sets of tokens to complete its responses. System 2 thinking has impressive results: scoring over 90% on the MATH benchmark, a collection of advanced mathematical problems, reports Reuters.
NVIDIA's suppliers in Taiwan prep for GB200 NVL36 AI servers in September, NVL72 in October
NVIDIA's various Taiwan-based systems and AI server part suppliers (cabinets, assembly, liquid cooling components, and more) expect orders for GB200 NVL36 AI servers to start in mid-September, with the higher-end GB200 NVL72 AI server cabinets expected in the second half of October.
In a new report from Ctee, we're learning that the "market generally believes" the GB200 AI server orders will be "launched as expected" with NVL36 and NVL72 AI server cabinets in mid-September, and late-October, respectively. GB200 is crucial for NVIDIA, as it's expected that out of the 5 million Blackwell chips to be produced in 2025, around 80% of them will be used in GB200.
Each GB200 AI GPU costs around $30,000, while the GB200 Superchip (CPU + GPU) costs upwards of $70,000. NVIDIA's new NVL72 AI server cabinet costs around $3 million per AI server and is filled with 72 x B200 GPUs and 36 x Grace CPUs. The NVL36 AI server cabinet features 36 x B200 AI GPUs and 18 x Grace CPUs.
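As a rough sanity check on the NVL72 figures quoted above, the short Python sketch below works out the GPU-to-CPU ratio and what the Superchips alone would add up to at the reported price. It assumes one GB200 Superchip per Grace CPU (which fits the 72:36 ratio), and the names `Cabinet`, `gpus_per_cpu`, and `superchip_bill` are hypothetical, chosen only for illustration. Treat it as back-of-envelope arithmetic on reported numbers, not NVIDIA pricing.

```python
from dataclasses import dataclass

# Reported figures from the article (approximate, in US dollars).
SUPERCHIP_PRICE_USD = 70_000    # "upwards of $70,000" per GB200 Superchip
NVL72_PRICE_USD = 3_000_000     # "around $3 million" per NVL72 cabinet


@dataclass(frozen=True)
class Cabinet:
    """Hypothetical container for a server cabinet's chip counts."""
    name: str
    gpus: int
    cpus: int


def gpus_per_cpu(cabinet: Cabinet) -> int:
    """Number of B200 GPUs for every Grace CPU in the cabinet."""
    return cabinet.gpus // cabinet.cpus


def superchip_bill(cabinet: Cabinet) -> int:
    """Superchip cost, assuming one Superchip per Grace CPU (an assumption)."""
    return cabinet.cpus * SUPERCHIP_PRICE_USD


nvl72 = Cabinet(name="GB200 NVL72", gpus=72, cpus=36)

ratio = gpus_per_cpu(nvl72)
bill = superchip_bill(nvl72)
share = bill / NVL72_PRICE_USD

print(f"{nvl72.name}: {ratio} GPUs per Grace CPU")
print(f"Superchips alone: ${bill:,} ({share:.0%} of the cabinet price)")
```

On these numbers, the Superchips would account for most of the cabinet's reported price, with the remainder covering networking, cooling, and the rest of the rack.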
Elon Musk teases Colossus: most powerful AI training system uses 100,000 NVIDIA H100 AI GPUs
Elon Musk has announced a major milestone for his AI startup, xAI, which brought its new AI training system "Colossus" online over the weekend.
Elon tweeted: "This weekend, the xAI team brought our Colossus 100K H100 training cluster online. From start to finish, it was done in 122 days. Colossus is the most powerful AI training system in the world. Moreover, it will double in size to 200K (50K H200s) in a few months. Excellent work by the team, NVIDIA and our many partners/suppliers".
Colossus is home to 100,000 of NVIDIA's current-gen Hopper H100 AI GPUs, while Musk says that soon the most powerful AI training system in the world will have 50,000 of NVIDIA's beefed-up H200 AI GPUs (faster HBM3E memory, and more of it over the H100 AI GPU).
NVIDIA shows off its beefed-up H200 AI GPU beating AMD's just-released Instinct MI300X
NVIDIA might have its new Blackwell AI GPU architecture slowly coming out, but its Hopper H100 and new H200 AI GPUs are continuing to get even stronger with new optimizations in the CUDA stack.
The H200 and H100 AI GPUs offer leading performance across every single test compared to the competition, including the latest benchmarks like the 56 billion parameter "Mixtral 8x7B" LLM.
NVIDIA's monster HGX H200 packing 8 x Hopper H200 GPUs and NVSwitch has some strong performance gains in Llama 2 70B, with token generation speeds of 34,864 (offline) and 32,790 (server) in the 1000W config, and 31,303 (offline) and 30,128 (server) in the 700W config.
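To put the two power configurations in perspective, the Python sketch below compares the quoted Llama 2 70B throughput figures against the jump from 700W to 1000W. Treating the wattage labels as the relative power budget is an assumption made for illustration, and the names `RESULTS`, `relative_gain`, and `power_increase` are hypothetical.

```python
# Throughput figures quoted in the article for the HGX H200 on Llama 2 70B,
# keyed by the configuration's wattage label.
RESULTS = {
    1000: {"offline": 34_864, "server": 32_790},
    700: {"offline": 31_303, "server": 30_128},
}


def relative_gain(scenario: str) -> float:
    """Fractional throughput gain of the 1000W config over the 700W config."""
    return RESULTS[1000][scenario] / RESULTS[700][scenario] - 1.0


def power_increase() -> float:
    """Fractional increase in the wattage label (assumed to track power budget)."""
    return 1000 / 700 - 1.0


offline_gain = relative_gain("offline")
server_gain = relative_gain("server")

print(f"Offline throughput gain: {offline_gain:.1%}")
print(f"Server throughput gain:  {server_gain:.1%}")
print(f"Power label increase:    {power_increase():.1%}")
```

On these figures, the higher-power configuration buys a throughput gain in the region of 9 to 11 percent for a roughly 43 percent higher wattage label, so the 700W configuration comes out ahead on throughput per watt under this assumption.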
OpenAI's first self-developed AI chip will be made by TSMC on its brand new A16 process node
OpenAI will have its first in-house AI chip made by TSMC on its new angstrom process (A16) production lines, according to the latest rumors.
UDN is reporting that OpenAI's first self-developed AI chip will be made by TSMC on its A16 process node, with its new AI chips said to power OpenAI's text-to-video service: Sora. OpenAI's new Sora text-to-video service will be a "major selling point" of Apple's AI in the future, reports UDN.
OpenAI's new in-house AI chip will "boost" Apple's efforts to attack the AI market, and stimulate the development of related networking and high-speed transmission industries. An institutional investor who spoke with UDN said that Sora has stimulated a surge in demand for data transmission speeds, and that silicon photonics and high-speed optical modules can increase those data transmission speeds.
Microsoft lifts the lid on its new AI chip, Maia 100, up to 700W TDP, built for large-scale AI
Microsoft is finally ready to enter the custom AI hardware race, a chip market in which NVIDIA has at least a 75 percent market share. At this year's Hot Chips conference, the company unveiled its first AI accelerator, Maia 100, built on TSMC's 5nm process node.
Designed to "optimize performance and reduce costs," Maia 100's architecture includes custom server boards, racks, and software for running AI services like Microsoft's Azure OpenAI Services.
"The Maia 100 accelerator is purpose-built for a wide range of cloud-based AI workloads," Microsoft's technical blog on Maia 100 details. "The chip measures out at ~820mm2 and utilizes TSMC's N5 process with COWOS-S interposer technology. Equipped with large on-die SRAM, Maia 100's reticle-size SoC die, combined with four HBM2E die, provide a total of 1.8 terabytes per second of bandwidth and 64 gigabytes of capacity to accommodate AI-scale data handling requirements."
OpenAI announces ChatGPT growth has more than doubled since 2023
OpenAI is one of the biggest companies in the world leading the push into artificial intelligence-powered services, and you could certainly argue the company was the first to popularize the technology that has been used by developers in less sophisticated forms for quite some time.
Artificial intelligence-powered devices, services, and applications are popping up everywhere, and the AI label is being slapped on seemingly every piece of technology that it can be applied to - even some that don't deserve it and are simply attempting to ride the hype surrounding AI products.
The popularity of AI can be traced back to the explosion that was the release of OpenAI's ChatGPT, which attracted more than 100 million monthly active users in just two months, setting the record for the fastest growing consumer application in history. ChatGPT was released in November 2022, and according to OpenAI its monthly active user base has grown substantially since then, attracting an additional 100 million users.
NVIDIA to join Apple, Microsoft funding round for OpenAI that values the AI startup at $100B+
NVIDIA is thinking about joining the new funding round for OpenAI, which would value the AI startup at over $100 billion, reports The Wall Street Journal.
According to "people familiar with the matter", the new valuation would peg OpenAI's worth at over $100 billion, and would see NVIDIA joining US-based tech giants Apple and Microsoft, who are also reportedly in talks to participate in the funding round.
Thrive Capital would be leading the funding round, investing around $1 billion, while two sources of the WSJ said that NVIDIA discussed investing about $100 million into OpenAI. The Wall Street Journal reached out to representatives from NVIDIA, Apple, Microsoft, OpenAI, and Thrive, but all of them declined to comment (no surprise).
ChatGPT has doubled its weekly active users: from 100 million to 200 million users per week
OpenAI has announced its ChatGPT chatbot service has over 200 million weekly active users, double the 100 million users that were using ChatGPT this time last year. That's a big jump in users, but then ChatGPT has taken over the world.
ChatGPT was launched in 2022, capable of generating human-like responses based on user prompts, and had over 100 million weekly active users, OpenAI CEO Sam Altman said in November. The AI startup said that 92% of Fortune 500 companies are using its AI-powered products, and that use of its automated Application Programming Interface (API), which allows software programs to talk to each other, has doubled since GPT-4o mini launched in July 2024.
GPT-4o mini is a cost-effective, smaller AI model that is aimed at making its AI-powered technology more affordable, and it uses less power, allowing OpenAI to target a wider range of customers.
Supermicro confirms NVIDIA B200 AI GPU delay: offers liquid-cooled H200 AI GPUs instead
Just before NVIDIA announced its issues with its new Blackwell AI GPUs, partner Supermicro seemingly confirmed Blackwell B200 AI GPUs being delayed, offering its customers liquid-cooled Hopper H200 AI GPUs in their place.
Supermicro CEO Charles Liang said that the possible delay of NVIDIA's new Blackwell GPUs for AI and HPC systems will not have a dramatic impact on AI server makers, or the AI server market. Liang said: "We heard NVIDIA may have some delay, and we treat that as a normal possibility".
Liang continued: "When they introduce a new technology, new product, [there is always a chance] there will be a push out a little bit. In this case, it pushed out a little bit. But to us, I believe we have no problem to provide the customer with a new solution like H200 liquid cooling. We have a lot of customers like that. So, although we hope better deploy in the schedule, that's good for a technology company, but this push out overall impact to us. It should be not too much".
NVIDIA says it will tweak Blackwell AI GPUs, issues with the 'GPU mask' needing B200 re-spin
Yep, NVIDIA has admitted it has had issues with its new Blackwell AI GPUs that are causing low yields, forcing the company to re-spin some of the layers of its new B200 AI GPU to boost yields.
NVIDIA said in a statement: "We executed a change to the Blackwell GPU mask to improve production yield. Blackwell production ramp is scheduled to begin in the fourth quarter and continue into fiscal 2026. In the fourth quarter, we expect to ship several billion dollars in Blackwell revenue".
The design flaws plaguing NVIDIA's new Blackwell AI GPUs hit headlines a couple of weeks ago, when analyst firm KeyBanc said NVIDIA would need to "respin" the Blackwell tile, causing a 3-month delay on shipments. Now these reports ring true. KeyBanc explained at the time: "Given the Blackwell delay, we believe NVIDIA will prioritize the ramp of B200 for hyperscalers and has effectively canceled B100, which will be replaced with a lower cost/performance GPU (B200A) targeted at enterprise customers".
AI creates a playable version of the original Doom, generating each frame in real-time
Google's research scientists have published a paper on its new GameNGen technology, an AI game engine that generates each new frame in real-time based on player input. It kind of sounds like Frame Generation gone mad in that everything is generated by AI, including visual effects, enemy movement, and more.
AI generating an entire game in real-time is impressive, even more so when GameNGen uses its tech to recreate a playable version of id Software's iconic Doom. This makes sense when you realize that getting Doom to run on lo-fi devices, high-tech gadgets, and even organic material is a rite of passage.
Seeing it in action, you can see some of the issues when it comes to AI generating everything (random artifacts, weird animation), but it's important to remember that everything you see is being generated and built around you in real-time as you move, strafe, and fire shotgun blasts at demons.
AMD details Instinct MI300X MCM GPU: 192GB of HBM3 out now, MI325X with 288GB HBM3E in October
AMD's new Instinct MI300X AI accelerator with 192GB of HBM3 has had a deep dive at Hot Chips 2024 this week, as well as the company teasing its refreshed MI325X with 288GB of HBM3E later this year.
Inside, AMD's new Instinct MI300X AI Accelerator features a total of 153 billion transistors, using a mix of TSMC's new 5nm and 6nm FinFET process nodes. There are 8 chiplets (XCDs) that each feature 4 shader engines, and each shader engine contains 10 compute units.
The entire chip packs 32 shader engines, with a total of 40 compute units inside of a single XCD and 320 in total across the entire package. Each individual XCD has its dedicated L2 cache, and the outskirts of the package feature the Infinity Fabric Link, 8 HBM3 IO sites, and a single PCIe Gen5 link with 128GB/sec of bandwidth that connects the MI300X to an AMD EPYC CPU.
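The per-chiplet figures above multiply out to the package totals, which the minimal Python sketch below makes explicit. The constant names are hypothetical, and the sketch only reproduces the counts stated in the article; it says nothing about how many units are enabled in shipping parts.

```python
# Per-chiplet figures as described in the article.
XCDS = 8                      # accelerator chiplets in the package
SHADER_ENGINES_PER_XCD = 4    # shader engines in each chiplet
CUS_PER_SHADER_ENGINE = 10    # compute units in each shader engine

# Totals implied by those figures.
shader_engines_total = XCDS * SHADER_ENGINES_PER_XCD
cus_per_xcd = SHADER_ENGINES_PER_XCD * CUS_PER_SHADER_ENGINE
cus_total = XCDS * cus_per_xcd

print(f"Shader engines across the package: {shader_engines_total}")
print(f"Compute units per XCD:             {cus_per_xcd}")
print(f"Compute units across the package:  {cus_total}")
```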
Intel shows off its next-gen Lunar Lake, Xeon 6, Gaudi 3 chips at Hot Chips 2024
Intel has announced new details on its new Xeon 6 SoC, Lunar Lake mobile processor, and Gaudi 3 AI accelerator and its OCI chiplet at Hot Chips 2024 this week.
First off is the new Intel Xeon 6 SoC, which combines the compute chiplet from Intel Xeon 6 processors with an edge-optimized I/O chiplet built on Intel 4 process technology. This enables the Xeon 6 SoC to deliver performance boosts over previous-generation Xeon CPUs with improved power efficiency and transistor density compared to previous-gen tech.
Intel will also share more details about its next-gen AI PC processor, Lunar Lake, with Arik Gihon, the lead client CPU SoC architect, set to talk about the new Lunar Lake CPU and how it's designed to "set a new bar for x86 power efficiency while delivering leading core, graphics and client AI performance".
IBM unveils Telum II CPU with 8 cores at 5.5GHz, Spyre AI accelerator: 300+ TOPS, 128GB LPDDR5
IBM has just unveiled its new Telum II processor and Spyre AI accelerator, which it plans to use inside of its new IBM Z mainframe systems powering AI workloads.
The company provided details of the architecture of its new Telum II processor and Spyre AI accelerator, which are designed for AI workloads on the next-gen IBM Z mainframes. The new mainframes will accelerate traditional AI workloads, as well as LLMs using a brand new ensemble method of AI.
IBM's new Telum II processor features 8 high-performance cores running at 5.5GHz, with 36MB L2 cache per core and a 40% increase in on-chip cache capacity for a total of 360MB. The virtual level-4 cache of 2.88GB per processor drawer provides a 40% increase over the previous generation. The integrated AI accelerator allows for low-latency, high-throughput in-transaction AI inferencing, for example enhancing fraud detection during financial transactions, and provides a fourfold increase in compute capacity per chip over the previous generation.
SK hynix's next-gen HBM4 tape out in October: ready for NVIDIA's future-gen Rubin R100 AI GPU
SK hynix is aiming to have its HBM4 memory tape-out in Q4 2024, ready for NVIDIA's next-gen Rubin R100 AI GPU coming in 2025.
In a new report from ZDnet, we're learning that SK hynix is nearing the final stage of commercializing its next-generation HBM4 memory, with the design drawings to be transferred to the manufacturing process, or "tape out". According to ZDnet's industry sources, SK hynix plans to complete the tape out of its HBM4 for NVIDIA in October, so we're just weeks away.
HBM4 offers a huge 40% increase in bandwidth and a rather incredible 70% reduction in power consumption compared to HBM3E, the fastest memory in the world. HBM4 density will also be 1.3x higher. With all of these advancements combined, the leap in performance and efficiency is a key driver in NVIDIA's continued AI GPU dominance.
TSMC to make $31 billion in 9 months from its 3nm and 5nm process nodes alone
TSMC is expected to make over NT $1 trillion (around $31 billion USD or so) in revenue from its 3nm and 5nm process nodes, in just a span of 9 months.
DigiTimes reports that TSMC will generate around $31 billion from just two of its high-end semiconductor nodes, thanks to their unstoppable demand -- customers like Apple, AMD, NVIDIA, Intel, Qualcomm, MediaTek -- with TSMC seeing huge revenue increases for Q2 2024 to NT$336.7 billion, or around 40% of their total revenue in Q1 2023.
TSMC estimates it will generate NT $754 billion (around $23 billion USD or so) from its 3nm and 5nm process nodes in Q3 2024, with major customers in Apple and NVIDIA.
This data center AI chip roadmap shows NVIDIA will dominate far into 2027 and beyond
In a recently shared data center AI chip roadmap posted on X, we get a good look at what companies have on the market already, and what's in the AI chip pipeline through to 2027.
The list includes chip makers NVIDIA, AMD, Intel, Google, Amazon, Microsoft, Meta, ByteDance, and Huawei. You can see the list of NVIDIA AI GPUs includes the Ampere A100 through to the Hopper H100, GH200, H200 AI GPUs, and into the Blackwell B200A, B200 Ultra, GB200 Ultra and GB200A. But after that -- which we all know is coming -- is Rubin and Rubin Ultra, both rocking next-gen HBM4 memory.
We also have AMD's growing line of Instinct MI series AI accelerators, with the MI250X through to the new MI350 and the upcoming MI400 listed in there for 2026 and beyond.