
The Rise of GenAI and LLMs


In 1950, Alan Turing proposed a test to determine whether a computer could mimic human intelligence well enough that an impartial observer could no longer tell the difference. We are still talking about the Turing Test almost 75 years after its inception.

Early in the era of computing, researchers looked to computers not to mimic human intelligence but to study it. In 1958, Cornell University’s Frank Rosenblatt published an article in Psychological Review titled “The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain.” This paper forms the foundation of LLMs and other neural network technologies. Rather than building models of cognition on qualitative theories, which Rosenblatt thought fell prey to circular reasoning in their proofs, he sought to create simulations of the human brain based on mechanical structures that could produce verifiable, reproducible results.

With the eye as the most accessible extension of the human mind, Rosenblatt modeled his perceptrons on human vision, introducing ideas that remain foundational to today’s LLMs: signals, feedback, and layers. He implemented his model on an IBM 704, a nearly 10-ton computer that eventually learned to distinguish punch cards marked on the right from those marked on the left.
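
To make the idea concrete, here is a minimal sketch of the perceptron learning rule in Python. The toy task, deciding whether a single mark falls in the left or right half of a row of input cells, only loosely mirrors the punch-card demo; it is not Rosenblatt’s IBM 704 implementation, and the names and parameters are illustrative.

import random

# A minimal sketch of the classic perceptron learning rule (illustrative;
# not Rosenblatt's original IBM 704 implementation). The toy task: decide
# whether a mark falls in the left or right half of a row of input cells.

WIDTH = 8  # number of input cells ("photocells")

def make_example():
    pos = random.randrange(WIDTH)
    x = [0] * WIDTH
    x[pos] = 1
    label = 1 if pos >= WIDTH // 2 else -1  # right half = +1, left half = -1
    return x, label

def train(steps=1000, lr=0.1):
    w = [0.0] * WIDTH  # one weight per input cell (the "signals")
    b = 0.0
    for _ in range(steps):
        x, y = make_example()
        activation = sum(wi * xi for wi, xi in zip(w, x)) + b
        pred = 1 if activation > 0 else -1
        if pred != y:  # feedback: adjust weights only when the guess is wrong
            w = [wi + lr * y * xi for wi, xi in zip(w, x)]
            b += lr * y
    return w, b

w, b = train()
print("learned weights:", [round(wi, 2) for wi in w], "bias:", round(b, 2))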

Although Rosenblatt did not live in a time of internet media, the idea of a computer that could learn appeared in the popular press, including a 1958 article with this headline in The New York Times: “NEW NAVY DEVICE LEARNS BY DOING; Psychologist Shows Embryo of Computer Designed to Read and Grow Wiser.” This topic was also covered in The New Yorker. Mathematicians and AI pioneers Marvin Minsky and Seymour A. Papert challenged Rosenblatt’s work by proving that single-layer perceptrons cannot solve the XOR problem: no single-layer network can compute the exclusive-or of two inputs. Their 1969 book, Perceptrons: An Introduction to Computational Geometry, published by MIT Press, precipitated a critical storm that resulted in the first AI winter. (A reissue of the 1988 expanded edition of the book, with a new foreword by Leon Bottou, was published in 2017.)
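
Their central example is easy to reproduce. The short sketch below brute-forces a grid of candidate weights for a single linear threshold unit and finds settings that compute AND but none that compute XOR. The grid search is only an illustration; the impossibility for XOR holds for any choice of real-valued weights.

from itertools import product

# A brute-force check of the point made by Minsky and Papert: a single
# linear threshold unit can compute AND but not XOR. The grid of candidate
# weights below is only an illustration; the impossibility for XOR holds
# for any real-valued weights.

def unit(w1, w2, b, x1, x2):
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def representable(target):
    grid = [i / 2 for i in range(-8, 9)]  # candidate weights from -4.0 to 4.0
    return any(
        all(unit(w1, w2, b, x1, x2) == target(x1, x2)
            for x1, x2 in product([0, 1], repeat=2))
        for w1, w2, b in product(grid, repeat=3)
    )

print("AND representable:", representable(lambda a, b: a & b))  # True
print("XOR representable:", representable(lambda a, b: a ^ b))  # False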

Unfortunately, Rosenblatt’s writing on three-layer perceptrons and other findings from earlier in his career could not offset Minsky and Papert’s controversial assertions. The linear approaches to computing popular at the time, Rosenblatt’s academic humility, and his death in 1971 all delayed the acceptance of his ideas.

After a decade of little progress, in which AI accomplishments included word-for-word translations of Russian into English and a program that could play checkers, along with continued research on perceptrons, the United States Defense Advanced Research Projects Agency (DARPA) withdrew AI funding. After the critical findings on the progress of AI research in the so-called Lighthill Report, written by Cambridge University applied mathematics professor James Lighthill (chilton-computing.org.uk/inf/literature/reports/lighthill_report/p001.htm), the U.K. also stopped funding AI research.

The first AI winter lasted from 1974 to 1980. Renewed interest in AI based on logic fueled an AI renaissance in the 1980s, but the fragility and cost of developing “expert systems” dashed the hopes of researchers in this new era, leading to a second AI winter that ran from 1987 to 1994. It would be almost 50 years before Rosenblatt’s ideas resurfaced to influence LLM technology.

TALKING TO COMPUTERS

While Rosenblatt taught computers to perceive, another early computer scientist taught them to converse. At the time, the two streams of research had nothing to do with each other. In 1966, Joseph Weizenbaum developed the first chatbot, Eliza, named after Eliza Doolittle, George Bernard Shaw’s lead character in Pygmalion. Eliza was written in an early list-processing language called SLIP, and its best-known script, DOCTOR, was designed to mimic a Rogerian psychotherapist.

The chatbot used simple pattern matching on user input to appear as though it was thoughtfully listening to a subject. It was not. Some predetermined responses incorporated fragments of the user’s input, while others prompted a deeper probing of the subject’s interests.
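
How little machinery this takes is easy to see in a short sketch of the keyword-and-template approach, written here in Python. The rules and wording are invented for illustration; this is not Weizenbaum’s SLIP code, and the real Eliza also swapped pronouns and ranked keywords.

import random
import re

# A minimal sketch of Eliza-style keyword matching in the spirit of the
# DOCTOR script. The rules and wording are invented for illustration;
# this is not Weizenbaum's SLIP code.

RULES = [
    (re.compile(r"\bi feel (.+)", re.I),
     ["Why do you feel {0}?", "How long have you felt {0}?"]),
    (re.compile(r"\bi am (.+)", re.I),
     ["Why do you say you are {0}?", "How does being {0} make you feel?"]),
    (re.compile(r"\bmy (.+)", re.I),
     ["Tell me more about your {0}."]),
]

FALLBACKS = ["Please go on.", "What does that suggest to you?"]

def respond(text):
    for pattern, templates in RULES:
        match = pattern.search(text)
        if match:
            fragment = match.group(1).rstrip(".!?")
            return random.choice(templates).format(fragment)
    return random.choice(FALLBACKS)  # canned reply when no keyword matches

print(respond("I feel anxious about my job"))
# e.g. "Why do you feel anxious about my job?" (the real Eliza also swapped pronouns)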

Eliza was not an LLM. It did not surface any behaviors on its own. Its code, whether written in SLIP, LISP, or BASIC, is fully traceable and transparent. While subjects may feel surprised and intrigued, computer scientists have no difficulty understanding how Eliza works. Cognitive psychologists, however, continue to marvel at how effectively such a simple program engages those who use it.

While Eliza’s underlying technology does not influence today’s systems, its interaction model has become the primary way LLMs engage with users. It isn’t enough for computers to gain access to vast knowledge stores; to extract value, people need to talk with them about what they know.

THE RISE OF LLMS

Following the second AI winter, much of AI research focused on mathematical, statistical, and probabilistic approaches, including neural networks. These systems found early adoption in automation, becoming the basis for many AI features we know today, such as image identification, automated photo enhancements, and managing computer performance.

In the 1980s, IBM started work on systems that could predict the next word in a sentence, building a dictionary of word occurrences from the text used to train its models. It wasn’t until the late 1980s that computer processing technology started to meet the demands of more sophisticated models. In 1991, the World Wide Web became a destination for publishing content, and with it, the data required to train models became increasingly, even overwhelmingly, available.
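
A toy version of that dictionary approach fits in a few lines: count which words follow which in the training text, then predict the most frequent continuation. The corpus and names below are made up for illustration and are not IBM’s actual system.

from collections import Counter, defaultdict

# A toy version of next-word prediction from word co-occurrence counts
# (illustrative only; not IBM's actual system). Each word maps to a tally
# of the words that follow it in the training text.

corpus = "the cat sat on the mat the cat ate the fish".split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1  # count bigrams: how often nxt follows prev

def predict_next(word):
    counts = following.get(word)
    if not counts:
        return None  # word never seen as a left context
    return counts.most_common(1)[0][0]  # most frequent continuation

print(predict_next("the"))  # -> "cat" ("cat" follows "the" twice in this corpus)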

More data required new approaches to understanding language. The essential pattern recognition of IBM’s text predictor would not meet the goals for more sophisticated human-machine interactions. Neural networks had to grow up.

Deep learning created pretrained models based on massive amounts of data. Simple neural networks gave way to multiple layers derived from Rosenblatt’s later work.

The path to today’s LLMs required several innovations that built on earlier successes and sought to overcome the deficiencies of previous approaches. Although the AI winter had been thawing for years, it wasn’t until 2020 that investment in and adoption of AI could be characterized as an AI spring.
