ChatGPT and other Large Language Models

uçuyorum

One concept for the deployment of AI and LLM models: Large language models are, as the name suggests, large. What researchers have realized with the transformer architecture is that as models grow in size, they begin to exhibit kinds of human-like reasoning that smaller models simply couldn't. But as these models grow, they also require enormous amounts of data and computing power. LLMs work well when trained on general data; the more they know about the world, the better they perform at most tasks, far more human-like than the AI we had 10 years ago. But when domain-specific knowledge is limited, particularly in restricted domains like the military, fine-tuning a model is also difficult.
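
To make the fine-tuning point concrete, here is a minimal sketch of one common approach to adapting a large pretrained model with a small, restricted dataset: parameter-efficient fine-tuning with LoRA-style adapters, where the base weights stay frozen and only a tiny low-rank update is trained. The layer sizes, rank and usage shown are illustrative assumptions, not a specific system's configuration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen pretrained linear layer with a small trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # pretrained weights stay frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)     # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Usage sketch: swap attention/MLP projections of a pretrained transformer for
# LoRALinear, then train only the adapter weights on the small domain dataset.
layer = nn.Linear(4096, 4096)                  # stand-in for one pretrained projection
adapted = LoRALinear(layer, rank=8)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")    # a tiny fraction of the full layer
```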

Also, LLMs are not good at maths and numerical data; they perform well with text data and manual labels. Extra tools are needed for something like ChatGPT to handle mathematics (the model can write out the equation and solve it with an external calculator). So an AI used for such purposes must also be trained to use those external tools (via special language tokens). This requires military data to be transformed into a format that LLMs can actually leverage.
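
A minimal sketch of that tool-use pattern: the model writes the equation inside a special token span, and an external calculator fills in the result before the answer goes out. The `<calc>...</calc>` markers and the idea that the model emits them are illustrative assumptions, not any particular framework's API.

```python
import ast
import operator
import re

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.USub: operator.neg}

def safe_eval(expr: str):
    """Evaluate a plain arithmetic expression without running arbitrary code."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp):
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def run_with_calculator(model_output: str) -> str:
    """Replace every <calc>expr</calc> span the model emitted with the computed result."""
    return re.sub(r"<calc>(.*?)</calc>",
                  lambda m: str(safe_eval(m.group(1))),
                  model_output)

# The model writes the equation instead of guessing the number:
print(run_with_calculator("Total fuel needed: <calc>3*1250 + 400</calc> litres"))
```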

Now getting to the crux of the issue: using these models in real time is a challenge of network communication. You can't have an instance of an LLM running on every vehicle, nor would it be as effective. But a single centralized instance makes bandwidth and security / EW resistance an issue. So the concept should basically be command-and-control type vehicles accompanied by trailers carrying servers that run LLM instances plus communications equipment, able to talk to assets in the area; local communication will be much faster as well.

This still requires some of the data to be preprocessed in place, however. Any sensor-equipped system should be able to process, and perhaps tokenize and compress, its own data before sharing it with these models. That means applying lessons from the Big Data and IoT fields: treating the system as a whole, using an asymmetrical distributed architecture, minimizing data-transfer overheads, and keeping data-processing commands simple. Large assets like ships should also have datacenters designed around future use of such advanced AI.
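
A minimal sketch of that edge-side preprocessing, assuming a sensor platform structures and compresses its report before pushing it over a constrained link to the nearby LLM server; the message fields and platform names are illustrative, not a real protocol.

```python
import json
import zlib

def build_payload(platform_id: str, report_text: str) -> bytes:
    """Edge side: structure the report, then compress it to cut transfer overhead."""
    message = {
        "platform": platform_id,
        "kind": "sensor_report",
        "text": report_text,          # could instead carry pre-tokenized IDs
    }
    raw = json.dumps(message, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, level=9)

def read_payload(blob: bytes) -> dict:
    """Server side: decompress and parse before feeding the text to the model."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

report = "Contact bearing 045, estimated range 12 km, classification unknown. " * 20
blob = build_payload("UGV-07", report)
print(len(report.encode()), "->", len(blob), "bytes on the wire")
print(read_payload(blob)["platform"])
```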
 
