Insider Q&A: Pentagon AI chief on network-centric warfare, and the challenges of generative AI
Craig Martell, head of the Pentagon’s Chief Digital and Artificial Intelligence Office, worries that generative AI systems like ChatGPT can deceive and sow disinformation. His talk on the technology at the DefCon hacker convention in August was a huge hit. But he is anything but sour on trustworthy AI.
Not a soldier but a data scientist, Martell headed machine learning at companies including LinkedIn, Dropbox and Lyft before taking the job last year.
Organizing US military data and determining which AI is trustworthy enough to take into battle is a major challenge in an increasingly unstable world where multiple nations are racing to develop lethal autonomous weapons.
The interview has been edited for length and clarity.
___
Q: What is your main mission?
A: Our mission is to extend decision-making advantage from the boardroom to the battlefield. I don’t see it as our job to deal with a few specific tasks, but rather to develop tools, processes, infrastructure, and policies that allow the department as a whole to scale.
Q: So the goal is global information dominance? What do you need to succeed?
A: We’ve finally arrived at network-centric warfare — how to get the right data to the right place at the right time. There is a hierarchy of needs: high-quality data at the bottom, analytics and metrics in the middle, and AI at the top. For this to work, what’s most important is high-quality data.
Q: How should we think about using AI in military applications?
A: All AI really is, is counting the past to predict the future. I don’t actually think the modern wave of AI is any different.
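(As a loose editorial illustration of that framing, and nothing drawn from any Pentagon system, the hypothetical Python sketch below literally counts which event followed which in the past and predicts the most frequent successor.)

```python
from collections import Counter, defaultdict

# A minimal "count the past to predict the future" model: tally observed
# transitions, then predict the most frequently seen successor state.
# The state names and history are invented for illustration only.
history = ["green", "red", "green", "red", "red", "green", "red"]

transitions = defaultdict(Counter)
for prev, nxt in zip(history, history[1:]):
    transitions[prev][nxt] += 1  # count what followed each state

def predict(state: str) -> str:
    """Return the most frequently observed successor of `state`."""
    counts = transitions.get(state)
    return counts.most_common(1)[0][0] if counts else "unknown"

print(predict("green"))  # -> "red"   ("red" followed "green" 3 of 3 times)
print(predict("red"))    # -> "green" ("green" followed "red" 2 of 3 times)
```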
Q: Pentagon planners say the Chinese threat makes developing artificial intelligence urgent. Is China winning the AI arms race?
A: I find this metaphor to be somewhat flawed. When we had a nuclear arms race, it was with homogeneous technology. Artificial intelligence is not. Nor is it Pandora’s box. It is a set of techniques that we apply on a case-by-case basis, empirically verifying whether they are effective or not.
Q: The US military is using artificial intelligence technology to help Ukraine. How are you helping?
A: Our team is involved with Ukraine only to help build a database of how allies can provide assistance. It’s called Skyblue. We’re just helping make sure that stays organized.
Q: There is a lot of discussion about lethal autonomous weapons, such as attack drones. The consensus is that humans will ultimately be reduced to a supervisory role, able to abort missions but mostly not intervening. Sound right?
A: In the military, we train with a technology until we develop justified confidence. We understand a system’s limits and know when it works and when it might not. How does that map to autonomous systems? Take my car. I trust the adaptive cruise control on it. The technology that is supposed to keep it from drifting out of its lane, on the other hand, is terrible. So I have no justified confidence in that system and don’t use it. Extrapolate that to the military.
Q: The Air Force’s “Loyal Wingman” program under development would have drones fly in tandem with fighter jets flown by humans. Is computer vision good enough to distinguish friend from foe?
A: Computer vision has made amazing strides in the past 10 years. Whether it is useful in a particular situation is an empirical question. We need to determine what accuracy we are willing to accept for the use case, build against that criterion, and test. So we cannot generalize. I would really love it if we stopped talking about the technology as a monolith and talked instead about the capabilities we want.
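(To make that criterion-then-test approach concrete: the hypothetical sketch below, which reflects no actual DoD system, sets a required accuracy for a use case, scores a stand-in classifier against labeled test data, and only “accepts” the capability if it clears the bar.)

```python
# Hypothetical acceptance test: pick the accuracy a use case demands,
# then measure a model against ground-truth labels before fielding it.
REQUIRED_ACCURACY = 0.95  # invented threshold; each use case would set its own

def classify(image_id: str) -> str:
    """Stand-in for a real computer-vision model."""
    fake_predictions = {"img1": "friend", "img2": "foe", "img3": "friend"}
    return fake_predictions[image_id]

ground_truth = {"img1": "friend", "img2": "foe", "img3": "foe"}

correct = sum(classify(i) == label for i, label in ground_truth.items())
accuracy = correct / len(ground_truth)

print(f"accuracy = {accuracy:.2f}")
if accuracy >= REQUIRED_ACCURACY:
    print("Meets the acceptance criterion for this use case.")
else:
    print("Does not meet the criterion; not good enough for this use case.")
```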
Q: You are currently assessing generative artificial intelligence and large language models. When might they be used in the Defense Department?
A: The commercial large language models are definitely not constrained to tell the truth, so I am leery. That said, through Task Force Lima (launched in August) we are studying more than 160 use cases. We want to decide what is low-risk and safe. I’m not setting official policy here, but let’s hypothesize. Low-risk could be something like generating first drafts of writing or computer code. In such cases, humans would do the editing or, for software, the compiling. The technology could also work for information retrieval, where facts can be checked to make sure they are correct.
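(The “humans stay the editors” pattern he hypothesizes can be sketched as a workflow, again purely illustratively; the generate_draft and human_review functions below are invented stand-ins, not anything from Task Force Lima.)

```python
# Hypothetical human-in-the-loop flow for a low-risk generative-AI use case:
# the model only produces a first draft, and nothing is used until a
# person has reviewed and edited it.
def generate_draft(prompt: str) -> str:
    """Stand-in for a large language model drafting text."""
    return f"DRAFT: first-pass summary responding to '{prompt}'"

def human_review(draft: str) -> str:
    """Stand-in for the mandatory human editing step."""
    edited = draft.replace("DRAFT:", "").strip()
    return edited + " [reviewed and edited by a human]"

draft = generate_draft("weekly logistics update")
final = human_review(draft)  # the human, not the model, signs off on the output
print(final)
```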
Q: A big challenge for AI is recruiting and retaining the talent needed to test and evaluate systems and label data. AI data scientists earn far more than the Pentagon has traditionally paid. How big a problem is this?
A: That’s a huge can of worms. We’ve just created a digital talent management office and are thinking hard about how to fill a whole new set of job roles. For example, do we really need to hire people who plan to stay at the Defense Department for 20-30 years? Probably not. But what if we could get them for three or four? What if we paid for their college, they paid us back with three or four years of service, and then they took off for a job in Silicon Valley? We’re thinking creatively like that. Could we, for instance, be part of a diversity pipeline? Recruit at HBCUs (historically Black colleges and universities)?