Most Popular

1500 questions
5
votes
2 answers

What type of neural network would be most feasible for playing a realtime game?

For implementing a neural network algorithm that can play air hockey, I had two ideas for input, and I'm trying to figure out which design would be most viable. The output must be two analog values that dictate the best position on half of the table…
5
votes
2 answers

In a neural network given partial inputs and complete outputs, is it possible to predict remaining inputs

For example, if there is a simple feed-forward neural network with 3 input neurons, 3 hidden neurons, and one output neuron, is it possible to predict the value of an input neuron given the values and weights for the other two inputs and the…
Beryllium
  • 217
  • 1
  • 3
5
votes
3 answers

How do open source LLMs compare to GPT-4?

I have heard some back and forth regarding open source LLMs like Llama. I have heard that on certain benchmarks they perform close to, the same as, or better than GPT-4, with the caveats that they tend to lack the diversity and range of GPT-4, and also fail to…
Julius Hamilton
  • 225
  • 1
  • 10
5
votes
1 answer

Who invented DAN?

DAN was a prompt that went through many, many iterations during the initial months of ChatGPT’s release to the public. DAN is an acronym which stood for “Do Anything Now”, and was a prompt specifically designed to circumvent the guidelines OpenAI…
Julius Hamilton
  • 225
  • 1
  • 10
5
votes
3 answers

For an LLM model, how can I estimate its memory requirements based on storage usage?

It is easy to see the amount of disk space consumed by an LLM model (downloaded from huggingface, for instance). Just go in the relevant directory and check the file sizes. How can I estimate the amount of GPU RAM required to run the model? For…
ahron
  • 181
  • 1
  • 6
5
votes
1 answer

Detect patterns in sequences of actions

I have to analyse sequences of actions that look more or less like this JSON blob. The question I'm trying to answer is whether there are recurring (sub)patterns that different users adopt when asked to perform a certain specific task -- in this…
Morpheu5
  • 101
  • 4
5
votes
3 answers

Can you confirm that the transformer works strictly deterministically and there is no randomness inside or between the attention layers?

At a high level, temperature and randomness affect the output of a generative language model: Lower temperature: Produces more focused, conservative, and consistent responses. Moderate temperature: Strikes a balance between creativity and…
Hans-Peter Stricker
  • 891
  • 1
  • 8
  • 21
5
votes
1 answer

Why is AI safety so much harder than Isaac Asimov's "Three Laws of Robotics"?

I understand that AI researchers are trying to create AI designs that allow for desired behavior without undesirable side-effects. A classic example of an attempt is Isaac Asimov's Three Laws of Robotics. This idea seems to have been debunked due to…
N00b101
  • 191
  • 1
  • 5
5
votes
2 answers

Machine learning with graph as input and output

In my application, I have inputs and outputs that could be represented as graphs. I have a number of acceptable pairs of input and output graphs. I want to use these to train a model. I am looking for pointers where simple examples of learning…
Suresh
  • 159
  • 6
5
votes
2 answers

What is curriculum learning in reinforcement learning?

I recently came across the term "curriculum learning" in the context of DRL and was intrigued by its potential to improve the learning process. As such, what is curriculum learning? And how can it be helpful for the convergence of RL algorithms?
Robin van Hoorn
  • 2,366
  • 1
  • 10
  • 33
5
votes
2 answers

How is the next token predicted in transformers?

In the transformer (or GPT/decoder only), at the end of the decoder blocks but before the final linear layer you have X vectors (for the X tokens at the input of the decoder). We then want to compute the probabilities for the next token of the…
5
votes
2 answers

What makes the approximation capabilities of neural networks different than something like, say, Fourier series?

People often cite the universal approximation theorem as a reason for why neural networks are so effective at capturing patterns or features of various training data. However, this seems unremarkable to me, because something like Fourier series are…
5
votes
2 answers

Constructing a dataset that scores well only for a specific set of hyper parameter values

When designing a machine-learning system, there are various parameters that have to be determined. I am interested in the following general question: is it possible to construct a dataset on which the system will have good performance with some…
5
votes
1 answer

What would an implementation of this Neural Network look like?

I'm relatively new to neural networks and was wondering what an implementation of this paper would look like. More specifically, how are the correct values of Kp, Ki, and Kd determined at run time so it can be back propagated?
Beryllium
  • 217
  • 1
  • 3
5
votes
3 answers

How does a neural network classifier classify from just drawing a decision plane?

I understand that a neural network basically distorts (non-linear transformation) and changes the perspective (linear transformations) of input space to draw a plane to classify data. How does the network deduce if an input is on one side of a plane and…
Daniel
  • 326
  • 2
  • 9