Lajamerr_Mittesdine

Lajamerr_Mittesdine t1_j1fuu1j wrote

It's 2022. Everyone should have their own pool of email addresses under their own domain name.

For example, with Google Domains you can easily spin up 100 email addresses forwarded to your main mailbox at no extra charge. It comes with your yearly domain renewal.

I create emails for each service I use.

reddit@mydomain.com, google@mydomain.com, walmart@mydomain.com

I can just create an email called junktest@mydomain.com

And if it ever gets too spammy, you can just delete that address from the list and it won't get forwarded to your main inbox.
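
A minimal sketch of how that per-alias routing behaves conceptually (the alias names and the `route` helper are just illustrations; Google Domains handles all of this in its email forwarding settings):

```python
# Illustrative only: per-service aliases all route to one inbox, and
# removing an alias from the list stops its mail from being forwarded.
FORWARD_TO = "me@mydomain.com"

aliases = {
    "reddit@mydomain.com",
    "google@mydomain.com",
    "walmart@mydomain.com",
    "junktest@mydomain.com",
}

def route(message_to: str) -> str | None:
    """Return the destination inbox, or None if the alias was removed."""
    if message_to in aliases:
        return FORWARD_TO
    return None  # alias deleted -> mail never reaches the main inbox

# If junktest@ starts attracting spam, just remove it:
aliases.discard("junktest@mydomain.com")
assert route("junktest@mydomain.com") is None
```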

0

Lajamerr_Mittesdine OP t1_itomfs6 wrote

CoT simply breaks down a problem into multiple interconnected solution statements to arrive at one conclusive answer.

You can prompt a CoT model to go down different reasoning structures and arrive at different answers (sometimes wrong ones), but those chains are all independent of one another.
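
As a rough illustration, a CoT prompt just asks the model to spell out its intermediate steps, and each sampled chain is independent of the others (the `generate` call below is a stand-in for whatever LLM API you use, not a real library function):

```python
# Hypothetical helper: builds a simple chain-of-thought prompt.
def cot_prompt(question: str) -> str:
    return f"Q: {question}\nA: Let's think step by step."

# Sampling the same prompt several times yields independent reasoning
# chains; some arrive at the right answer, some do not.
# chains = [generate(cot_prompt("If a train travels 60 km in 45 minutes, "
#                               "what is its average speed?"), temperature=0.8)
#           for _ in range(5)]
```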

Note that this is fine-tuning an existing LLM.

This fine-tuning is done in part by a hypermodel that helps rank solutions. Those ranked solutions are then used to fine-tune the model even further, so it becomes a better reasoner using its own generated answers.

So the model uses its own understanding to generate CoT solution statements. The hypermodel ranks those statements, and then the existing model can be fine-tuned on the newly generated positive and negative solutions, reinforcing what correct solution statements look like and what incorrect ones look like.
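
A rough sketch of that loop as I read it (names like `generate_cot`, `score`, and `finetune` are placeholders for illustration, not the paper's actual code):

```python
# Pseudocode-style sketch of the sample -> rank -> fine-tune loop described above.
def self_improvement_round(llm, hypermodel, problems, samples_per_problem=8):
    positives, negatives = [], []
    for problem in problems:
        # 1. The LLM samples several CoT solutions from its own understanding.
        candidates = [llm.generate_cot(problem) for _ in range(samples_per_problem)]
        # 2. The hypermodel ranks the candidate solutions.
        ranked = sorted(candidates, key=hypermodel.score, reverse=True)
        # 3. Top-ranked chains become positive examples, bottom-ranked negative.
        positives.extend(ranked[: samples_per_problem // 4])
        negatives.extend(ranked[-(samples_per_problem // 4):])
    # 4. Fine-tune the same LLM on its own ranked outputs.
    llm.finetune(positive=positives, negative=negatives)
    return llm
```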

Future work: So what is keeping the LLM from eventually getting to ~100%? The bottleneck preventing this from compounding is how accurately the hypermodel can rank the solutions. Theoretically, if you had a perfect ranker black box, you could eventually get to ~100%. So what you would want in future work is either a more accurate ranker overall, or some way to continuously improve the ranker hypermodel in an unsupervised fashion, just like this hypermodel does for the LLM.

Personal opinion: What this is really doing is picking low-hanging fruit. It prompts the LLM into reasoning patterns it already understands in different contexts and more reliably surfaces them as the highest-ranking solutions across a broader range of problems. It's not learning entirely new concepts.

10

Lajamerr_Mittesdine t1_isa34ib wrote

All the answers are incomplete because they don't provide the assumptions necessary to arrive at a complete solution.

A more complete answer would look like this.

>Assuming just gravitational forces, both the lighter and heavier baseballs would fall at the same rate and reach the surface at approximately the same time. This can be affected, however, by additional forces that may be present, such as an atmosphere providing extra resistance depending on the surface area, density, and total mass of each object.

Though even that is an incomplete answer.
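
For what it's worth, the vacuum part of that answer follows from the mass canceling out of basic kinematics (standard textbook result, nothing specific to the question):

```latex
% Only gravity acting: the mass cancels, so acceleration and fall time
% are independent of how heavy the ball is.
\[
  ma = mg \;\Rightarrow\; a = g, \qquad
  h = \tfrac{1}{2} g t^2 \;\Rightarrow\; t = \sqrt{\frac{2h}{g}}
  \quad \text{(independent of } m\text{)}
\]
```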

1

Lajamerr_Mittesdine t1_is89dnx wrote

I have a project idea and would like some feedback on feasibility.

I want to create an ML model that I would use in a subsequent model's training loop.

This first model would take an image of x-by-x dimensions as input and then output instructions for a custom image creation tool: the steps for re-creating the image.

The instructions would be semi-human-readable but mostly meant for the program to interpret; they would look like the following and serve as arguments for the custom image creation tool.

>412, 123 #FF00FF ----- This would turn this one pixel fuchsia
>
>130, 350 ; 150, 400 #000000 ----- This would turn this rectangle of pixels on the canvas black.

And there would be many more complex tools available to take in as arguments; a small sketch of how the two commands above might be interpreted follows.
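
To make the instruction format concrete, here is a minimal interpreter for just those two commands (the exact syntax and the tool itself are things I'd still have to design, so treat the parsing rules here as placeholders):

```python
from PIL import Image  # assumed dependency for the canvas

def apply_instruction(canvas: Image.Image, line: str) -> None:
    """Apply one pixel or rectangle instruction to the canvas in place."""
    coords, color = line.rsplit(" ", 1)  # e.g. "130, 350 ; 150, 400" and "#000000"
    points = [tuple(int(v) for v in p.split(",")) for p in coords.split(";")]
    rgb = tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))
    if len(points) == 1:
        canvas.putpixel(points[0], rgb)        # single pixel
    else:
        (x0, y0), (x1, y1) = points            # filled rectangle between two corners
        for x in range(x0, x1 + 1):
            for y in range(y0, y1 + 1):
                canvas.putpixel((x, y), rgb)

canvas = Image.new("RGB", (512, 512), "white")
apply_instruction(canvas, "412, 123 #FF00FF")
apply_instruction(canvas, "130, 350 ; 150, 400 #000000")
```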

The reward function would have two stages. The first stage is how close the recreated image is to the original, which would be easy to compute. The second stage would reward instruction minimization, i.e. 5,000 steps to recreate the image would be rewarded more highly than 10,000 steps.
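
A sketch of what that two-stage reward might look like (the similarity metric and the length-penalty weight are arbitrary choices on my part, not settled design decisions):

```python
import numpy as np

def reward(original: np.ndarray, recreated: np.ndarray,
           num_instructions: int, length_weight: float = 1e-4) -> float:
    """Stage 1: pixel-level similarity. Stage 2: penalty for longer programs."""
    similarity = 1.0 - np.mean(np.abs(original.astype(float) - recreated.astype(float))) / 255.0
    return similarity - length_weight * num_instructions

# 5,000 instructions scores higher than 10,000 for the same reconstruction quality.
img = np.zeros((64, 64, 3), dtype=np.uint8)
assert reward(img, img, 5_000) > reward(img, img, 10_000)
```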

It would also be easy to set the upper bound on instructions to the total pixel count of the image, so a run can be killed if it hits that limit without recreating the 1:1 image it was given as input.

The program would also allow, as an input argument, the ability to define custom functions, which the model would also be given the ability to do. One thing that would incentivize the model to create and use its custom functions is that the reward would be tweaked so that calling a function the model has defined counts as fewer instructions than issuing those instructions individually.

This first model is all about training it to recreate any arbitrary image 1:1 in as few discrete instructions as possible.

This model/program would then be used in a second models training loop which I would like to keep secret for now.

1