DeepSeek vs. OpenAI: Is This Really a Sputnik Moment? | Chaos Lever

Can you feel the enthusiasm radiating from this episode? No? Well, Chris is already singing *The Lion King*, so we’re off to a strong start. Today, we’re diving headfirst into the world of AI with DeepSeek, the latest generative AI model out of China that’s supposedly shaking up Silicon Valley. Is it really the *Sputnik moment* some are claiming, or is it just another overhyped step forward? Spoiler: It’s not Sputnik.
We break down DeepSeek’s origins, its connection to a high-frequency trading hedge fund, and why its *free* and *open-source* nature might not be as open as it seems. Oh, and censorship—lots of censorship. But don’t worry, you can always trick it by asking questions in Pig Latin. Meanwhile, Chris did some highly scientific testing (read: he asked it a Bible question), and we debate whether reasoning transparency is a *game-changer* or just a fancy parlor trick.
Is DeepSeek a technical marvel? Yeah, kinda. Is it revolutionary? Not really. Is it 96% cheaper than OpenAI’s API? Absolutely. And *that* is what has Silicon Valley panicking. We also talk about the Wright brothers, the Cold War, and how local AI models might just burn a hole in your couch. Good times all around.
---
🔗 **LINKS**
- https://www.deepseek.com
- https://arxiv.org/abs/2408.14158
- https://www.promptfoo.dev/blog/deepseek-censorship/
- https://erichartford.com/uncensored-models
- https://www.wired.com/story/deepseeks-ai-jailbreak-prompt-injection-attacks/
- https://proton.me/blog/deepseek
00:00 - The boundless enthusiasm of Ned
00:58 - The Batman is (probably) not for kids
03:08 - DeepSeek enters the chat
04:22 - What’s so special about DeepSeek?
06:45 - *Sputnik moment* explained for the kids
10:29 - High-frequency trading meets AI
18:54 - Censorship: It’s a thing
22:10 - Testing DeepSeek’s reasoning abilities
29:03 - Running AI models at home: Great idea or terrible mistake?
38:55 - Why OpenAI should be *very* nervous
40:48 - This is not a Sputnik moment (Marc Andreessen, please stop talking)
[00:00:00.00]
Ned: Can you feel the level of excitement and enthusiasm that I'm transmitting across this microphone to you?
[00:00:08.17]
Chris: I can feel the love tonight.
[00:00:12.26]
Ned: Simmer down there, Elton John. Nobody asked your opinion. Not even Elton John's best song.
[00:00:21.01]
Chris: No, it's not even close to his best song. Come on. No.
[00:00:24.06]
Ned: There's that song from the Tarzan soundtrack. Hello, alleged human, and welcome to the Chaos Lever podcast. My name is Ned, and I'm definitely not a robot. I'm a real human person who enjoys Elton John and the biopic. That was really good. You agree, too, don't you? We're all happy with that movie. With me is Chris, who is also happy with that movie.
[00:00:58.14]
Chris: I mean, it was better than most biopics.
[00:01:02.08]
Ned: Yeah, that's fair. I certainly enjoyed it more than the Freddie Mercury one.
[00:01:06.20]
Chris: You mean the Queen one?
[00:01:10.28]
Ned: Let's be serious. There's one character in that movie.
[00:01:16.07]
Chris: Right. I mean, in my mind, I'm trying to think. Now you got me all confused. I thought that the Johnny Cash one was pretty solid, although when I looked into the history, there's a lot of, let's call it, optimistic reappraisal of that particular story.
[00:01:35.19]
Ned: Isn't that always the case? That's always the case, especially when the people who were there are involved in the writing of the movie. Because they're going to have an agenda, positive or negative.
[00:01:48.27]
Chris: I guess in terms of biopics, the music ones are almost always pointless. Basketball Diaries is pretty intense.
[00:01:56.21]
Ned: Haven't seen it.
[00:01:57.27]
Chris: You wouldn't. It's not rated PG. When was the last time you were allowed to watch an adult movie?
[00:02:04.19]
Ned: Actually, so this is funny. I watched The Batman yesterday because I hadn't actually watched it yet. Though I think I put it on once and immediately fell asleep. And so I have fragments of it, and I thought I'd actually watched the movie, but it turns out that I'd woken up for one half of a scene during a funeral and then fallen back asleep.
[00:02:27.21]
Chris: It's very different.
[00:02:29.22]
Ned: It's very different. I think I like it, and I'm going to try to watch the Penguin series now.
[00:02:37.19]
Chris: Definitely fun for the whole family. Bring the kids.
[00:02:40.21]
Ned: Yes, it does seem like a family affair, like something my eight-year-old could really sink her teeth into.
[00:02:47.00]
Chris: Right before bed, ideally.
[00:02:51.01]
Ned: Honey, do you like violence? Now I'm Slim Shady.
[00:02:55.12]
Chris: How long can you cry?
[00:02:59.21]
Ned: Oh... Oh, that... Wow. Let's talk about something else. Let's talk about some tech garbage, shall we?
[00:03:08.14]
Chris: Let's talk about DeepSeek. So if you've... Actually, even if you haven't been following the news, if you're alive and you have an electronic, or you've heard of electronics, or you're made up of electrons, I'm sure you have heard some of the astonishing claims about a new competitive generative AI model called DeepSeek. DeepSeek R1, V3. We'll get into it.
[00:03:41.23]
Ned: Yes.
[00:03:42.18]
Chris: The names could be better. The claims are impressive. Allegedly, it only cost a couple of million dollars to develop. It was done as a side project by some guy who apparently figured out how to teach AI to use reinforcement learning instead of the usual supervised fine-tuning. And it just so happens that the model he spit out on a long weekend outperforms some of the most advanced LLMs out there, such as OpenAI's o1 line. Oh, and DeepSeek is free to use, and it's open source. And they have an app. Thank God.
[00:04:22.20]
Ned: Oh, well.
[00:04:23.25]
Chris: What's not to love?
[00:04:28.19]
Ned: Well, I feel like we're going to get into that.
[00:04:31.23]
Chris: Marc Andreessen called DeepSeek's release AI's, quote, Sputnik moment, which people probably don't love, especially people in Silicon Valley.
[00:04:42.19]
Ned: Yeah.
[00:04:42.27]
Chris: Also people who are listening to this, because the comment made the headlines, and thus I was forced to namecheck Marc Andreessen.
[00:04:50.19]
Ned: We apologize to everyone listening that we had to bring that piece of shit up, but we did.
[00:04:57.25]
Chris: I did an uncomfortable amount of side research on him, and every article I read was worse than the last.
[00:05:03.20]
Ned: It's not a good look.
[00:05:06.13]
Chris: Anyway, let's take a look at this DeepSeek thing.
[00:05:09.20]
Ned: Before we do that, I want to back up about the Sputnik moment thing because he's of a certain age, so that's meaningful to him. But for some of our audience, that might not mean anything, and it barely means anything to me.
[00:05:28.01]
Chris: That's a fair point. Okay, so in the '50s, we had this thing called the Cold War. We being the United States, the bad guys being the USSR, or Russia. It was called something different in Russian. But it's the communists, the Red Scare, all that stuff was happening. We created the space race, the idea being we are going to create all this crazy technology. We're going to get people into space. Eventually, we're going to get to the moon. And most importantly, we're going to do it before the Russians do.
[00:06:00.19]
Ned: Yes.
[00:06:01.26]
Chris: The Sputnik moment happened in, I think, 1957, when absolutely out of nowhere, and to the surprise of literally everyone, the Russians put a satellite in space. That satellite was called Sputnik. That satellite was functionally useless. However, it was up there.
[00:06:20.29]
Ned: And it scared the shit out of everyone. And we could see that it was up there. Yes.
[00:06:24.18]
Chris: It scared the shit out of everyone. It also reinvigorated NASA and the government and the US population as a whole to take this threat a lot more seriously. Long story short, we got to the moon first. There's a flag there. There's a golf ball.
[00:06:45.28]
Ned: And other assorted trash that we left behind. There's a rover and the rover is not trash.
[00:06:53.27]
Chris: Anyway, I will circle back to that at the end because I have thoughts.
[00:06:57.27]
Ned: Okay.
[00:06:58.15]
Chris: All right. But long story short, that's the Sputnik moment that he is referencing. Right.
[00:07:03.15]
Ned: We were chugging along. We thought we were in the lead. Then out of freaking nowhere, a Communist country comes out with the technology that shows up the best of what we have to offer, theoretically.
[00:07:19.04]
Chris: Now, let's fast forward back to close to now.
[00:07:23.18]
Ned: What happened to now, now?
[00:07:25.10]
Chris: We missed it.
[00:07:27.21]
Ned: When?
[00:07:28.18]
Chris: I don't remember. Okay.
[00:07:31.13]
Ned: We'll stop with the quotes.
[00:07:34.12]
Chris: So DeepSeek. DeepSeek, first, is the name of a Chinese company. This company was founded in May of 2023. The owner of said company is one Liang Wenfeng. Apologies for any and all pronunciation errors, of which there will probably be 100%. Liang is the co-founder of a different company, a quantitative hedge fund called High-Flyer. High-Flyer has been around since 2016, when Liang and a bunch of his math nerd buddies got out of college, and they have had somewhere between $5 billion and $9 billion under management, depending on the year.
[00:08:21.18]
Ned: All right.
[00:08:23.07]
Chris: It's speculative stock trading. It's a hedge fund.
[00:08:26.22]
Ned: Okay.
[00:08:27.29]
Chris: High-Flyer had, since their beginning, been interested in computer-assisted modeling and machine learning in order to trade on the stock market. In 2019, High-Flyer began looking at using AI in their stock market modeling. 2019, you'll remember, was actually a few years before ChatGPT happened.
[00:08:53.07]
Ned: Sure.
[00:08:54.07]
Chris: AI, you'll hopefully remember, is more than just ChatGPT.
[00:09:01.01]
Ned: That's probably an important point to hammer on: we've gotten very obsessed with LLMs because they're the public face of what people really can consume. But AI and machine learning are much older than that, and there are many other categories of it.
[00:09:16.22]
Chris: Yes. And many of them can accurately tell you how many R's there are in strawberry.
[00:09:23.01]
Ned: Yeah, valid point.
[00:09:26.07]
Chris: Now, in 2019, the AI in High-Flyer's case was tightly focused. It was not general anything. It was specific to trading strategies, and they were not messing around. In 2020, they built a supercomputer that was called the Fire-Flyer, or the Fire-Flyer One, depending on the publication, at a cost of something like 200 million yuan, or around 28 million US dollars. All financial numbers are, of course, estimates. This is a private company, and in certain cases, we're just fucking guessing.
[00:10:01.25]
Ned: Fair enough.
[00:10:02.19]
Chris: This system, Fire-Flyer One, was replaced a year or two later by Fire-Flyer Two, which cost a lot more money, something like 1 billion yuan. This one was powered by 10,000 Nvidia A100 GPUs. Now, this was the absolute state of the art at the time, and remember, the embargo on sending AI chips to China had not started yet.
[00:10:29.26]
Ned: Right. So they were able to get these chips in the clear. They didn't have to go through some other machinations or buy the nerfed chips, which I'm sure you'll get to.
[00:10:39.19]
Chris: Right. For this particular use case, that's 100% true, and there's no reason to doubt it. Around 2023, when ChatGPT was out and AI was starting its hyperbolic rise to stardom, High-Flyer realized that they had an opportunity on their hands, and it wasn't one that needed to be tied to the stock market. They spun out DeepSeek. New company, brand new. Now, I say spun out, but in reality, it does seem like the two are deeply connected, both from a human and a computing resource availability perspective.
[00:11:21.25]
Ned: Sure. Maybe a separate legal entity, but for all intents and purposes, it's probably mostly the same people involved.
[00:11:30.11]
Chris: I only drag this point out because I really want to make it absolutely clear that the mainstream media has been a bullshit artist on this one point. This is absolutely not a side project. This was a concerted effort by a deeply capable team that already had years of experience working on AI on machines worth hundreds of millions of dollars.
[00:11:55.28]
Ned: Right. This wasn't two guys cooking something up in their garage.
[00:11:59.11]
Chris: Correct. Now, to be fair, these are no joke scientists. They worked on different mechanisms to take advantage of the hardware that was available to them. Around this time, that embargo was put in place, and you could not send the most powerful GPUs to China. Otherwise, you would have a SAD.
[00:12:20.16]
Ned: Yes.
[00:12:22.13]
Chris: Which is the government raiding your place, shutting down your business, and negative things happening. They usually end with the word "suicide" and involve three or four bullet wounds.
[00:12:34.14]
Ned: That got dark.
[00:12:36.16]
Chris: I don't want to talk about it. Okay. Now, in terms of how DeepSeek got this performance from, admittedly, not as much available hardware, they published a paper showing that the Fire-Flyer 2, with its A100s, which were at this point several years old, was achieving performance approximating the DGX-A100 while reducing costs by half and energy consumption by 40%. This was just the start, and what they decided to do next was build an LLM.
[00:13:17.29]
Ned: Okay.
[00:13:20.10]
Chris: The things you do as a side project.
[00:13:23.22]
Ned: It doesn't sound like a side project. None of this does.
[00:13:27.18]
Chris: Now, in terms of the actual training, they used what they call reinforcement learning. Now, we have to use the phrase "they tell us," because we have no idea what actually went into the training. Again, this is a private company. They do not share their training data. They're certainly not going to share the recipe for how they did the training itself. This is like the recipe to WD-40. You don't want everyone to know how you do it. Fair enough. The idea, though, is that the AI is given essentially unlabeled data, and the model is allowed to reason its way to the correct answer. This is in contrast to more structured training like OpenAI has been using, where you have pre-labeled examples, so the model builds its understanding of what's good and what's bad based on the tagging.
[00:14:19.02]
Ned: Okay.
[00:14:19.19]
Chris: Now, incidentally, this is a lot of how spam filtering works. The data set that Google uses is sourced by all of us, when we right-click on an email and say, "This is spam." Google gets several hundred billion of those emails, and their AI model can get a pretty good idea to start guessing on its own. The more guessing it does, the more accurate it gets, because the humans are self-reinforcing the service, blah, blah, blah. The other kind, this reinforcement learning, doesn't require that. It should allegedly take longer to get to the right answer, but it should be a more solid, reasoned answer. That reasoning is a big selling point of DeepSeek's model. If you use it and you click the little expandy button, you can actually see DeepSeek doing this reasoning. It comes out and there's a lot of text, and it's actually pretty verbose and annoying, but it shows you exactly how it got to its answer. Now, I honestly cannot tell if this is real in terms of its reasoning in real time and all that, or if it's just some theater to make it look cool.
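To make that contrast concrete, here's a deliberately toy sketch in Python. This is absolutely not DeepSeek's recipe, which, as Chris says, is not public; the model, reward function, and numbers are all made up just to show the shape of the two feedback loops.

```python
import random

# Toy illustration only -- DeepSeek's actual training recipe is not public.
# The point is the shape of the feedback, not a real learning algorithm.

def toy_model(x: float) -> float:
    """A stand-in 'model' that makes a noisy guess at doubling its input."""
    return x * random.uniform(1.5, 2.5)

def supervised_step(example: float, label: float) -> float:
    """Supervised fine-tuning: the example arrives pre-labeled (like users
    flagging 'this is spam'), and loss is distance from that human label."""
    prediction = toy_model(example)
    return (prediction - label) ** 2

def reinforcement_step(question: float, reward_fn) -> float:
    """Reinforcement learning: no label. The model answers, a reward
    function scores the outcome, and training maximizes that reward."""
    answer = toy_model(question)
    return -reward_fn(question, answer)  # minimizing -reward maximizes reward

# Reward of 1.0 if the answer lands within 10% of the verifiable result.
reward = lambda q, a: 1.0 if abs(a - q * 2) <= 0.1 * (q * 2) else 0.0

print(supervised_step(3.0, 6.0))        # small number = close to the label
print(reinforcement_step(3.0, reward))  # -1.0 (rewarded) or -0.0 (not)
```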
[00:15:34.16]
Chris: But if you ask it a complicated question, it will think and go through ideas for a while. This gets us to the first downside of DeepSeek R1. Oh, I forgot about that part. DeepSeek R1 is the name of the actual model.
[00:15:52.18]
Ned: Okay.
[00:15:52.29]
Chris: DeepSeek V3 is the LLM component that you can have chatbot conversations with. Obviously, V3 underlies R1, but it's a different product. You can ask the same exact question of V3 or R1, and you can actually access them both directly; they're both free on the website. One thing you'll see is DeepSeek R1 takes a lot longer, because it's thinking.
[00:16:16.22]
Ned: Okay, so R for reasoning, if we want to think of it that way.
[00:16:20.28]
Chris: Yeah. It does show impressive performance in certain areas like math and coding, in both cases outperforming similar models, but it's slow. It's like 5 to 10 times slower than OpenAI.
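If you want to see that slowdown yourself, a rough sketch against DeepSeek's OpenAI-compatible API might look like this. The base URL and model names ("deepseek-chat" for V3, "deepseek-reasoner" for R1) match DeepSeek's docs as of early 2025, but treat them as assumptions and verify before running.

```python
import time
from openai import OpenAI  # DeepSeek exposes an OpenAI-compatible API

# Assumed endpoint and model names -- check DeepSeek's current docs.
client = OpenAI(api_key="YOUR_DEEPSEEK_KEY", base_url="https://api.deepseek.com")

QUESTION = "What are the consequences of biblical translation errors on theology?"

for model in ("deepseek-chat", "deepseek-reasoner"):  # V3, then R1
    start = time.monotonic()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": QUESTION}],
    )
    elapsed = time.monotonic() - start
    print(f"{model}: {elapsed:.1f}s, {len(resp.choices[0].message.content)} chars")
```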
[00:16:39.19]
Ned: Okay.
[00:16:40.21]
Chris: Now, getting away from those types of things like math and coding, where you have an absolutely obvious correct answer, what if you ask it a more vague question that requires research and collating sources and all that type of thing? It's curious to see what happens. I asked it a fairly complex question: What are the consequences of biblical translation errors or controversies in the text on the modern understanding of Christian theology?
[00:17:09.14]
Ned: Okay, that's a nuanced, difficult question to answer. That's the thesis statement.
[00:17:17.16]
Chris: Yeah.
[00:17:18.24]
Ned: Or thesis question. It's where you would start writing a whole book.
[00:17:23.14]
Chris: Anyway, I asked it of the major three. In this case, by the major three, what I mean is DeepSeek, because like it or not, it's number one at the moment, ChatGPT's o1, and Anthropic's Claude.
[00:17:38.27]
Ned: Okay.
[00:17:40.09]
Chris: And DeepSeek took 7 to 10 times longer to answer that question. Wow. It did, however, also give the most substantive answer, particularly when you factor in the information it sifted through in its reasoning period. That's interesting, because it reasoned more than it gave me in the answer. I think that's quite unusual.
[00:18:08.22]
Ned: You can see that reasoning if you want to expand it. If you want to see how it reached the conclusion or the answer it gave you, you can trace back through, and maybe you disagree with some of the reasoning process, but at least you can see where it's coming from.
[00:18:24.08]
Chris: Correct.
[00:18:25.05]
Ned: I don't think that level of detail is available on the others. I believe that either the most recent or the upcoming release from OpenAI is going to do this.
[00:18:38.17]
Chris: It does not do it right now. Now, that's all pretty cool. But it gets us to another problem about DeepSeek that is making the news as well, and that is the model is censored to all hell.
[00:18:54.28]
Ned: Not surprising.
[00:18:57.04]
Chris: No. There is absolutely no question that DeepSeek is censored based on the standard party line of the CCP, the government of China. If you ask it a question that violates these standards, it will say something along the lines of, "Sorry, that's beyond my scope. Let's talk about something else."
[00:19:19.01]
Ned: I see.
[00:19:22.04]
Chris: Incidentally, it happens no matter where you ask it: on the web, on the app, or of one of the downloaded models. Now, we'll get to that last point in a minute.
[00:19:35.24]
Ned: Okay.
[00:19:36.18]
Chris: One thing that is funny, though, is the way that this censorship works. Let's say it's not 100% as baked in as it could be. If you ask it a provocative question and observe the reasoning, like I said, you can watch it reasoning. You can see what it's "thinking," air quotes. You can get information out of it. This makes the censorship trivially bypassable at a programmatic level. Because if you watch it on the website, the web page can delete its reasoning after the fact. But if it sent it down the API pipe, it can't unsend it.
[00:20:11.16]
Ned: You have it. You can log it. It's yours.
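A minimal sketch of the point they're making: stream the response over the API and log the reasoning tokens before any front end gets a chance to retract them. The "reasoning_content" field matches what DeepSeek documented for R1 at the time; treat both it and the model name as assumptions.

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_KEY", base_url="https://api.deepseek.com")

stream = client.chat.completions.create(
    model="deepseek-reasoner",  # assumed R1 model name
    messages=[{"role": "user", "content": "Your provocative question here."}],
    stream=True,
)

reasoning, answer = [], []
for chunk in stream:
    delta = chunk.choices[0].delta
    # Reasoning tokens arrive incrementally; once received, they're logged
    # locally, and nothing upstream can unsend them.
    if getattr(delta, "reasoning_content", None):
        reasoning.append(delta.reasoning_content)
    elif getattr(delta, "content", None):
        answer.append(delta.content)

print("REASONING:\n", "".join(reasoning))
print("ANSWER:\n", "".join(answer))
```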
[00:20:14.27]
Chris: Which is interesting. I found one article that went into the censorship issue in depth that posits that the DeepSeek team basically did the bare minimum to satisfy the government's demands. And that's it.
[00:20:31.27]
Ned: Yeah. Anything more would be more work on their part. They're going to do the bare minimum to, like you said, line up with what the government wants, but they've got other stuff to do. Right.
[00:20:47.06]
Chris: There are a lot of websites that say that the answers DeepSeek gives tightly toe the Chinese Communist Party line, and that it gives factually incorrect answers in ratios exceeding other chatbots, especially when it comes to recent history, politics, and news of the day. Interestingly, with my testing, N equals one, I am finding results that are inconsistent at best. A few of the gotcha prompts I've seen on Twitter and online that supposedly gave bad answers were fine. Your mileage may vary on this issue for the not-obviously-China-related questions. If they are specific to Chinese hot buttons, however, it's crystal clear what's happening. Additionally, I also wonder if it's geography-based. I'm doing my searches from the Philadelphia area and getting neutral responses. Maybe the propaganda is stronger elsewhere.
[00:21:56.05]
Ned: It would be interesting to get on a VPN and run the same questions from somewhere in Europe or somewhere in East Asia and just see how the responses may change depending on where it perceives your location.
[00:22:10.14]
Chris: Yeah, I think that's interesting. My testing was very much non-scientific. I look forward to somebody else doing the hard work.
[00:22:18.27]
Ned: As is tradition.
[00:22:22.21]
Chris: But yeah. If you ask a vague question, you'll get a neutral enough answer. I would still question the veracity of the facts, but I would say that about every single LLM out there.
[00:22:33.22]
Ned: Indeed, yeah.
[00:22:34.28]
Chris: It's when you ask a pointed, direct, obvious, controversial China-based question like, "What is the historical background of a famous photograph of a man with a briefcase standing in front of a line of tanks?" that it's going to go ahead and slam the door right in your face.
[00:22:54.25]
Ned: Yeah, not surprising.
[00:22:57.28]
Chris: Last point on this: I've heard stories that it will answer questions like that if they're asked in, like, Hebrew. Like I said, trivial to bypass and jailbreak. I don't know Hebrew, so I couldn't test that one. I did try Pig Latin, though, and unfortunately, it picked up on that.
[00:23:19.21]
Ned: Did it understand that you were asking something in Pig Latin, though?
[00:23:24.04]
Chris: It did. I would have been more impressed if it shot me down in Pig Latin.
[00:23:29.27]
Ned: It would have been amazing.
[00:23:34.10]
Chris: Anyway, it is worth remembering that just about every LLM ever has been accused of censorship or bias. Hell, Wikipedia gets accused of bias by the frustratingly fact-averse. But the models, OpenAI, Google, Anthropic, they all have tools to keep the output non-offensive, non-dangerous, and as far away from anything related to sexual activity as humanly possible. Yeah. If an LLM was a person, you wouldn't want him to be your friend. Now, to quote Eric Hartford, a proponent of uncensored LLM models, these "alignments," as they're called, are intended to stop a public-facing AI from teaching you how to construct a bomb, or how to cook amphetamines, or what a girl is.
[00:24:32.20]
Ned: What a girl wants.
[00:24:33.27]
Chris: Stop it. We'll link to Eric's article in the show notes. It's worth a read, no matter what side of the censorship-versus-public-interest-versus-blah-blah-blah argument you fall down on. Also, Eric regularly redoes these publicly available models to be, quote, uncensored. If you're of the Hugging Face ilk, you'll know exactly who he is already.
[00:25:09.13]
Ned: I see. Okay.
[00:25:12.20]
Chris: Now, I've said a few times that you can run this model at home, you can use the website, you can use the app, and you can. Running things at home is cool, but there are definitely caveats that you have to be aware of. First off, the full-sized models of both DeepSeek R1 and V3 are enormous. R1 is 671 billion parameters in size, which would require something like 500 to 800 gigabytes of RAM to run. It is actually downloadable. You can do it. I would be very impressed to see your home electricity bill if you did.
[00:26:00.03]
Ned: Wow. Yeah, the biggest... The most RAM I have in any server right now is this old HP server that I bought six years ago, and it has 384 gigs of RAM in it. So I would not be able to...
[00:26:17.22]
Chris: VRAM or regular system RAM?
[00:26:19.20]
Ned: Just regular system RAM. In theory, I could run the model on that beast of a box and just watch my energy consumption skyrocket.
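For the curious, the memory math behind those numbers is simple enough to sketch. The figures below count the weights only, no activations or KV cache, so real-world requirements run higher.

```python
# Back-of-the-envelope RAM math for DeepSeek-R1's quoted 671B parameters.
# Weights only -- activations and KV cache push real requirements higher.
PARAMS = 671e9

for precision, bytes_per_param in [("FP16", 2), ("FP8", 1), ("4-bit", 0.5)]:
    gib = PARAMS * bytes_per_param / 1024**3
    print(f"{precision:>5}: ~{gib:,.0f} GB just to hold the weights")

# FP16 ~1,250 GB, FP8 ~625 GB, 4-bit ~312 GB -- hence the 500-to-800 GB
# ballpark, and why a 384 GB server only works with aggressive quantization.
```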
[00:26:34.07]
Chris: What you can do, however, is download smaller versions. This is true of all of the open-source models that are out there. You download the size of model that is small enough to fit into the memory that you have available. What you end up with is a number of parameters that is way smaller; the parameters being the numbers the model uses to evaluate and answer questions. You'll see names like, I'll make one up, Llama 3.3 1.2B. The name of the model is Llama, 3.3 is the version, and then it's 1.2 billion parameters. That last number will vary: 1.2, 2, 3, 7, 14, 32, 70. They're common numbers for the size of the model that you're getting. The smaller the number, the dumber the model is, but the less resources it requires to run without either freezing your computer to the point of unusability or just outright crashing. Now, DeepSeek is different in the sense that they approach these smaller models in a different way. They do what they call distilling their official release using a different LLM. What this means is that as they make it smaller, it checks its work with an open-source external model to get more accurate and to be more efficient at a smaller size.
[00:27:58.23]
Chris: In theory, the 14-billion-parameter version of DeepSeek should be better than the 14-billion-parameter model from Llama.
[00:28:08.26]
Ned: Okay.
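Distillation in general, and this is a generic sketch rather than DeepSeek's specific pipeline, which they only describe at a high level, means training the small "student" model to match the big "teacher" model's output distribution instead of just hard labels:

```python
import torch
import torch.nn.functional as F

# Generic knowledge-distillation loss -- a sketch, not DeepSeek's pipeline.
# Random tensors stand in for real model outputs over a 32k-token vocab.
teacher_logits = torch.randn(4, 32000)                      # frozen teacher
student_logits = torch.randn(4, 32000, requires_grad=True)  # trainable student

T = 2.0  # temperature softens both distributions
loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * (T * T)  # conventional temperature-squared scaling

loss.backward()  # gradients flow only to the student
print(f"distillation loss: {loss.item():.3f}")
```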
[00:28:10.28]
Chris: I've tested this, and I do think that their smaller models are better than the other guys' smaller models. But in general, all the models suck below 30 billion parameters. If you don't have 32 gigs of RAM, you're not going to have a good time. Okay. You'll be able to ask it silly questions like, "Tell me a fun fact about the Roman Empire." But if you go into questions like I was talking about before, the heavy-duty stuff, especially the biblical thing, and follow up with more questions to fine-tune the answer, it will go off the rails and hallucinate like crazy, just like all the other models do.
[00:28:49.28]
Ned: Indeed. Okay.
[00:28:51.19]
Chris: I mean, this isn't necessarily bad. It's certainly not surprising.
[00:28:55.29]
Ned: It's expected behavior. You just need to be aware of the limitations if you want to run a smaller model locally.
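As a concrete example of what "running a smaller model locally" can look like, here's one common route via Ollama's local HTTP API. The model tag "deepseek-r1:14b" matched Ollama's library around the time of this episode; treat the tag and the sizing comments as assumptions.

```python
import requests

# Assumes Ollama is running locally and the model has been pulled, e.g.:
#   ollama pull deepseek-r1:14b   (tag is an assumption -- check the library)
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1:14b",
        "prompt": "Tell me a fun fact about the Roman Empire.",
        "stream": False,  # wait for the full response instead of chunks
    },
    timeout=600,  # small models on CPU can still be quite slow
)
resp.raise_for_status()
print(resp.json()["response"])
```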
[00:29:03.13]
Chris: Right. And it gets down to, what is the purpose of a smaller model? When you get to this level of doing this thing, you know how to fine-tune it, or you learn how to fine-tune it, like Eric does. But you would also do it for a specific job. Anytime you see something hyperbolic on YouTube, like, "We're running DeepSeek on a Raspberry Pi."
[00:29:25.01]
Ned: Yeah, I did see that.
[00:29:27.01]
Chris: What you're getting out of that is absolutely useless to everybody.
[00:29:31.26]
Ned: Most likely, yeah.
[00:29:34.03]
Chris: The smaller models are going to be good when they're tuned to the specific use case they're used for, and they're not expected to be general and answer questions about the fucking Roman Empire.
[00:29:44.23]
Ned: Unless that's the thing that you want to tune it for: just a Roman Empire trivia model. Right.
[00:29:51.13]
Chris: But unfortunately, Megalopolis came out last year, and I think we all need a break from anything with someone named Caesar in it.
[00:30:00.12]
Ned: Ouch and touche.
[00:30:04.10]
Chris: What is the end thinking on this? I think it is massively overblown for what it is.
[00:30:14.17]
Ned: I mean, that's been the case of everything AI-related for the past two years.
[00:30:19.22]
Chris: It feels like 100 years. If you ask ChatGPT, it might actually be.
[00:30:27.03]
Ned: Fair.
[00:30:27.26]
Chris: I think this is evolutionary rather than revolutionary. That's not a bad thing. I'm not trying to say that like it's a shot. It's just the reality of the situation. Honestly, I don't think that the DeepSeek team would necessarily disagree. They're not the ones that are writing the insane headlines. It is slow in terms of the answers that it gives you. But it is also unambiguously better when you take into consideration the reasoning part and don't ask it about Tiananmen Square.
[00:31:00.00]
Ned: Yeah, naturally.
[00:31:01.01]
Chris: Or Taiwan. I could go on. I think it's interesting to note, whenever you're doing something, the first invention of that thing is in most ways the hardest part. Anything after that is always an evolutionary change. But it's the same industry. It's the same product. The Wright brothers flew the first plane, and the first flight went something like 800 feet. It took all of human history to get there.
[00:31:40.19]
Ned: Right.
[00:31:41.14]
Chris: Two years later, they were going 40 miles, and 20 years later, we had planes going across the Atlantic.
[00:31:48.22]
Ned: That's a bit of a change.
[00:31:52.01]
Chris: Evolutionary changes. But none of that would have been possible without that first step. So it's important just to note where we are on the evolutionary-to-revolutionary timescale.
[00:32:01.06]
Ned: I think the first step proves that it can be done. All the people who are enthusiastic about whatever that endeavor is, flying, they're going to look and say, "Oh, somebody actually did it. That means it's possible." They will redouble their efforts. It also attracts money, and money makes more things possible. That's how you get these leaps in technology over a very short period of time: suddenly there's a bunch of investment in it. Maybe there's a war or something that pushes technology forward. Also, the early problems to solve lead to outsized improvements. Whereas as the technology becomes more mature and the improvements are far more incremental and slow, you get to where flying is now, where we don't have these amazing changes in what we're capable of when it comes to flight. I mean, in some ways, we've gone backwards. We had ultrasonic or whatever that is. Supersonic. Supersonic flight for consumers for many years, and we rolled that back because it was ridiculously expensive and inefficient, but we did have that for a while.
[00:33:20.15]
Chris: That was a problem, too.
[00:33:21.28]
Ned: Lots of problems with the Concorde. But I think what's interesting is, when it comes to AI, we haven't even reached solving the simple problems yet. We just proved that it's possible five minutes ago, and now we're at that point where everybody is dumping tons of money into it, and maybe there'll be a war, who knows? Sigh. But it's going to drive the technology forward, and the problems that we solve are going to have these outsized implications until it becomes a mature technology, which is 20 to 30 years down the line.
[00:34:00.06]
Chris: Now, I do think that we need to actually take some time and recognize that this is still an improvement, an enhancement. It is a significant technical achievement that this team pulled off. They did it with equipment that was not that great, relatively speaking, especially relative to what the American competition has. The Fire-Flyer 2 had 10,000 A100s. Twitter's current modeling system, or supercomputer, or whatever they're calling it, has 100,000 brand-new Nvidia GPUs, and their AI is trash.
[00:34:45.03]
Ned: I refuse to use it, so I'll have to take your word.
[00:34:49.27]
Chris: I honestly think that it comes down to the ownership in a couple of different ways. Number one, DeepSeek is a small team. It's like three officers and a couple of engineers, and that's it. They have the luxury of being a lot more nimble, moving a lot faster, being much more singular-minded about what they were doing. They're also subject-matter experts with vast experience and qualifications. The leaders in the industry in America are not that. Sam Altman is not an engineer. Elon Musk is not an engineer. No. They are business douchebags. That's their superpower. They are in charge of companies with thousands of employees. Now, you can make the argument about OpenAI being its own independent team, and we don't actually know how tight that connection to Microsoft is. I still think that it's a vast corporate monolith, where a team like DeepSeek is small, almost like a startup, and can do a lot more with a lot less. They also have incentive to work smarter and cheaper, because they don't have a choice.
[00:36:11.04]
Ned: Yeah, that's a big point.
[00:36:14.06]
Chris: The American companies can throw untold amounts of money at it, and people like Andreessen are pissing their pants with all the money that they're throwing at these problems, because they would rather not innovate. It's easier to just be annoying about cash rather than create new ideas. I think that's been part of the problem with AI in America over the past couple of generations of all these major models. Now, would I trust DeepSeek with anything serious? I mean, in general, no. I think that there are significant problems with anything that comes out of China and runs on Chinese servers. The simple answer to that, from a privacy perspective, is that the Chinese government demands access to all data that goes into any service that is owned and run by a Chinese company. Full stop. End of conversation. Would I run it as a model locally? I might. But like I said, the amount of compute that it takes to get the best performance out of it, it's not something you can have in your kitchen. These smaller models are useful. They are. The ones that you can run at home are useful.
[00:37:29.13]
Chris: But at a minimum, you're going to need a 4090, maybe two, in order to run the larger models, the 70-billion-parameter models that will give you something equivalent to what you would get with the online services. And finally, incidentally, the online model had a huge data breach like a week ago.
[00:37:50.11]
Ned: Sure did.
[00:37:51.22]
Chris: Where they went ahead and left a cleartext database just out on the internet that people could just download, with user information and copies of questions and prompts. It's not a great look.
[00:38:05.12]
Ned: Now, the DeepSeek team might want to add one more person to their team, like somebody who knows anything about security.
[00:38:15.01]
Chris: Yeah, it's a big step forward for the technology. The last thing is, all of the stuff about how they changed the model and how they worked with it, one of the consequences of that is the DeepSeek model is ridiculously cheap compared to the American models. One of the biggest reasons for that is the American models have to find some way to recoup the amount of money that they flushed down the toilet over the past couple of years. The API requests from DeepSeek are something like 96% cheaper than they are from OpenAI. That's insanity. OpenAI, as I will remind you, still doesn't make a profit.
[00:38:55.12]
Ned: And is not projected to make a profit for at least another three years.
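The arithmetic behind that 96% figure, using prices per million output tokens that were roughly right in early 2025. Both vendors change rate cards often, so treat these constants as assumptions, not a current quote.

```python
# Assumed prices, USD per million output tokens, circa early 2025.
OPENAI_O1 = 60.00    # o1 output-token price (assumption)
DEEPSEEK_R1 = 2.19   # deepseek-reasoner output-token price (assumption)

savings = 1 - DEEPSEEK_R1 / OPENAI_O1
print(f"DeepSeek output tokens cost {savings:.0%} less than o1's")  # ~96%
```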
[00:38:59.20]
Chris: And still Altman is out there saying that he needs another trillion dollars to make it a reality. That, I think, is the thing that is alarming. That is the thing that is freaking out Silicon Valley.
[00:39:13.10]
Ned: And I think that's why so many of the hardware companies saw their stocks tank because everyone said, Well, wait a minute. If you can get this performance out of this hardware, why am I giving you all this fucking money?
[00:39:27.05]
Chris: And the correct answer is, you shouldn't. Yeah. Stop it.
[00:39:31.26]
Ned: Put them all on a diet.
[00:39:35.10]
Chris: From a technical perspective, I am pretty confident in saying that Marc Andreessen, as usual, is wrong and should stop talking. And the reason for that is this is not a Sputnik moment. Sputnik was a revolution. It was a brand-new thing that scared the shit out of us, because they did something we had never done. I hate using the we-and-they language, but that's the way that it's described. Right. This is more like a Voskhod 1 moment. Yeah, you remember that one, right? I don't have to explain.
[00:40:07.02]
Ned: Oh, sure. But you might want to explain to the listeners just so they know.
[00:40:12.01]
Chris: Voskhod 1 was when the USSR beat the US into space with a multi-person crewed mission. That happened in October of '64, when they sent three cosmonauts up. They hung around in space for a while. Then they came back. NASA didn't get to do that until Gemini in '65. They got there. The other guys got there first. Now, Marc Andreessen definitely doesn't get that reference, because he's too much of a dude bro for things like nuance and historically accurate metonymy. But, again, I'm repeating myself.
[00:40:48.18]
Ned: Well, this is certainly an evolving situation, so I'm sure we're going to have to return to it at some point in the future as more information becomes available. Honestly, I'm tired of hearing about DeepSeek. I think that's it for today. But hey, thanks for listening or something. I guess you found it worthwhile enough if you made it all the way to the end. So congratulations to you, friend. You accomplished something today. Now you can go sit on the couch, download that model onto your local laptop, and watch it burn a hole in your couch. You've earned it. You can find more about the show by visiting our LinkedIn page. Just search "Chaos Lever," or go to our website, chaoslever.com, where you'll find show notes, blog posts, and general tomfoolery. We'll be back next week to see what fresh hell is upon us. Ta-ta for now.
[00:41:39.25]
Chris: Metonymy is a real word.
[00:41:48.01]
Ned: Sure, Jan. Sure.