English Amiga Board


Old 17 May 2023, 01:29   #1
redblade
Zone Friend
 
 
Join Date: Mar 2004
Location: Middle Earth
Age: 40
Posts: 2,127
LLaMA, Alpaca Amiga AI?

https://crfm.stanford.edu/2023/03/13/alpaca.html
From the website:
Quote:
We introduce Alpaca 7B, a model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations. On our preliminary evaluation of single-turn instruction following, Alpaca behaves qualitatively similarly to OpenAI’s text-davinci-003, while being surprisingly small and easy/cheap to reproduce (<600$).
https://crfm.stanford.edu/alpaca/
https://github.com/tatsu-lab/stanford_alpaca

It's a small, ChatGPT-type LLM (large language model) that can run on smaller hardware and that you can fine-tune for specialist (niche) areas.
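
Prompting one of these models from Python looks roughly like this. Just a sketch: the weights path is a placeholder (you need locally converted LLaMA/Alpaca weights), and I'm using the standard Hugging Face transformers API plus Alpaca's published prompt template, with a made-up Amiga task.

Code:
# Rough sketch of prompting a local Alpaca-style model with Hugging Face
# transformers. "path/to/alpaca-7b" is a placeholder for locally converted
# weights; even the 7B model needs far more than 4GB of RAM.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("path/to/alpaca-7b")
model = AutoModelForCausalLM.from_pretrained("path/to/alpaca-7b")

# Alpaca's instruction prompt template, with a hypothetical Amiga task.
prompt = ("Below is an instruction that describes a task. "
          "Write a response that appropriately completes the request.\n\n"
          "### Instruction:\nWrite 68k assembler that clears a screen buffer.\n\n"
          "### Response:\n")
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))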

Of course, one of the problems with Amiga assembler is that a lot of the code doesn't follow the Commodore guidelines (demo coding/hardware bashing) from the late 80s, which might make it a bit more difficult to train the AI, unless it is used to fix that code and make it more OS-friendly?

My laptop only has 4GB of RAM and I can't upgrade it (it also has a broken screen).
Has anyone used it? Would it be a good idea to feed in the source code from Aminet? 68k assembler, AmigaE, ARexx, C89, ...

Does anyone have much experience with this? I have to reinstall Lubuntu, and I'm going to resize my swap space to 16GB and try to run the smaller model (not the 65B one).
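
Back-of-the-envelope arithmetic on whether 4GB RAM plus 16GB swap can even hold the weights (just a sketch: the parameter counts are the published LLaMA sizes, and real usage is higher once you add activations and the KV cache):

Code:
# Bytes needed just to store LLaMA weights at different precisions.
sizes_b = {"7B": 7, "13B": 13, "33B": 33, "65B": 65}   # published model sizes
bytes_per_param = {"fp16": 2, "int8": 1, "int4": 0.5}  # int4 via quantisation

for name, billions in sizes_b.items():
    for prec, nbytes in bytes_per_param.items():
        gib = billions * 1e9 * nbytes / 2**30
        print(f"{name} @ {prec}: ~{gib:.1f} GiB")

By that reckoning a 4-bit 7B model is around 3.3 GiB, so it's borderline even with swap; the 65B one is hopeless on this machine.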
Old 17 May 2023, 12:11   #2
TEG
Registered User
 
 
Join Date: Apr 2017
Location: France
Posts: 567
The problem, I presume, is that you have to gather the input data, which is scattered all over the place, and write some code to extract it.
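
The gathering step itself is the easy half. Assuming you've mirrored and unpacked the archives somewhere (Aminet ships .lha files, so unpack them with lha or lhasa first; the directory name here is made up), a sketch:

Code:
# Walk an unpacked local mirror and collect source files by extension.
from pathlib import Path

SOURCE_EXTS = {".s", ".asm", ".e", ".rexx", ".c", ".h"}  # 68k asm, AmigaE, ARexx, C

def collect_sources(root):
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in SOURCE_EXTS:
            yield path

for src in collect_sources("aminet_unpacked"):
    # Old Amiga files are rarely UTF-8; Latin-1 never fails to decode.
    text = src.read_text(encoding="latin-1")
    print(src, len(text.splitlines()), "lines")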
Old 17 May 2023, 12:19   #3
Locutus
Registered User
 
Join Date: Jul 2014
Location: Finland
Posts: 1,176
From the readme:

Quote:
Below is a command that fine-tunes LLaMA-7B with our dataset on a machine with 4 A100 80G GPUs in FSDP full_shard mode.
It doesn't matter that you only have 4GB locally; an A100 is a >$10k card, and they used four of them for training. In the AI world, that is 'small'.

If you want to train this with your own dataset, you wouldn't do it locally but provision AWS EC2 instances with attached A100 accelerator cards. That's how they arrive at the $600 price tag.
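
The $600 roughly decomposes like this (the hourly rate is my assumption, not a quote; Stanford reported about 3 hours of fine-tuning plus roughly $500 of OpenAI API calls to generate the 52K demonstrations):

Code:
gpus = 4                  # per the readme command quoted above
rate_per_gpu_hour = 5.0   # assumed USD/hour for a cloud 80GB A100
training_hours = 3        # reported fine-tuning time for the 7B model

compute = gpus * rate_per_gpu_hour * training_hours
data_generation = 500     # reported OpenAI API cost for the dataset
print(f"~${compute:.0f} compute + ~${data_generation} data = ~${compute + data_generation:.0f} total")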

Last edited by Locutus; 17 May 2023 at 12:19. Reason: Markdown quotes are muscle memory.
Old 17 May 2023, 13:09   #4
TEG
Registered User
 
 
Join Date: Apr 2017
Location: France
Posts: 567
You mean A1000 Accelerator cards of course ;-)
Old 17 May 2023, 23:51   #5
redblade
Zone Friend
 
 
Join Date: Mar 2004
Location: Middle Earth
Age: 40
Posts: 2,127
Quote:
Originally Posted by TEG View Post
The problem, I presume, is you have to gather the input data which are all over the place and have some code to extract it.
I was just going to rip code from Aminet, or the code that came on the disks supplied with programming books (Abacus, Paul Overaa).
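
If I got that far, the training data would need to go into Alpaca's format, a JSON list of instruction/input/output records. A naive sketch (the directory is the same made-up unpacked tree as above, and writing genuinely useful instructions for each file is the hard, manual part):

Code:
import json
from pathlib import Path

records = []
for path in Path("aminet_unpacked").rglob("*.s"):
    code = path.read_text(encoding="latin-1", errors="replace")
    records.append({
        "instruction": f"Write 68k assembler in the style of the Aminet file {path.name}.",  # far too naive
        "input": "",
        "output": code,
    })

Path("amiga_alpaca_data.json").write_text(json.dumps(records, indent=2))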

Quote:
Originally Posted by Locutus View Post
From the readme:
Doesn't matter you only have 4GB locally, that's a >10k Card and they used 4 for training. In the AI world that is 'small'.

If you want to train this with your own dataset, you wouldn't do this locally but provision AWS EC2 instances with attached A100 Accelerator cards. That's how they come to the 600$ price tag.
Damn, I thought it was too good to be true. I was reading that someone managed to get a version of the model up and running on a Raspberry Pi 4.

Shows you how little I know about hardware. Maybe in 5 years, with Moore's law.
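
For reference, the Raspberry Pi 4 reports were inference only, using llama.cpp with 4-bit quantised weights. Through the llama-cpp-python bindings it looks something like this (the model filename is a placeholder):

Code:
# Inference only, no training: a 4-bit quantised model via llama.cpp bindings.
from llama_cpp import Llama

llm = Llama(model_path="ggml-alpaca-7b-q4.bin")  # placeholder quantised weights
result = llm("### Instruction:\nName the Amiga custom chips.\n\n### Response:\n",
             max_tokens=64)
print(result["choices"][0]["text"])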
Old 18 May 2023, 07:28   #6
Locutus
Registered User
 
Join Date: Jul 2014
Location: Finland
Posts: 1,176
There is a difference between 'running' the model, which is something you can do on client hardware (a Raspberry Pi, in your example), and 'training' it.

Before you can run the model, it needs to be trained, which requires an order of magnitude more compute power and memory; that's what's being done with the Nvidia A100 accelerators.

But first things first: preparing data for the generation model isn't as simple as just downloading all of Aminet, unpacking all the archives, and copying out every `*.c` file.
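
Even a minimal cleaning pass has to deduplicate, repair encodings, and drop junk before the text is usable for training. A sketch of the kind of filtering I mean (the threshold and directory name are arbitrary):

Code:
import hashlib
from pathlib import Path

seen, kept, dropped = set(), 0, 0
for path in Path("aminet_unpacked").rglob("*.c"):
    text = path.read_text(encoding="latin-1", errors="replace")
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    if digest in seen or len(text) < 200:  # exact duplicates, trivial stubs
        dropped += 1
        continue
    seen.add(digest)
    kept += 1
print(f"kept {kept} files, dropped {dropped}")

And that still says nothing about licences, mixed archives, or whether the code even assembles.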
 

