Highest Voted Questions - Artificial Intelligence Stack Exchange

5

votes

2 answers

What type of neural network would be most feasible for playing a realtime game?

For implementing a neural network algorithm that can play air hockey, I had two ideas for input, and I'm trying to figure out which design would be most viable. The output must be two analog values that dictate the best position on half of the table…

asked Sep 28 '17 at 17:11

Patrick Roberts

153
5

5

votes

2 answers

In a neural network given partial inputs and complete outputs, is it possible to predict the remaining inputs?

For example, if there is a simple feed-forward neural network with 3 input neurons, 3 hidden neurons, and one output neuron, is it possible to predict the value of an input neuron given the values and weights for the other two inputs and the…

neural-networks

asked Sep 27 '17 at 20:18

Beryllium

217
1
3

5

votes

3 answers

How do open source LLMs compare to GPT-4?

I have heard some back and forth regarding open source LLMs like Llama. I have heard that on certain benchmarks they perform close to, as well as, or better than GPT-4, but with the caveat that they tend to lack the diversity and range of GPT-4, and also fail to…

asked Jul 09 '23 at 08:54

Julius Hamilton

225
1
10

5

votes

1 answer

Who invented DAN?

DAN was a prompt that went through many, many iterations during the initial months of ChatGPT’s release to the public. DAN is an acronym which stood for “Do Anything Now”, and was a prompt specifically designed to circumvent the grid lines OpenAI…

asked Jul 08 '23 at 11:24

Julius Hamilton

225
1
10

5

votes

3 answers

For an LLM model, how can I estimate its memory requirements based on storage usage?

It is easy to see the amount of disk space consumed by an LLM model (downloaded from huggingface, for instance). Just go in the relevant directory and check the file sizes. How can I estimate the amount of GPU RAM required to run the model? For…

asked Jun 14 '23 at 12:01

ahron

181
1
6

5

votes

1 answer

Detect patterns in sequences of actions

I have to analyse sequences of actions that look more or less like this JSON blob. The question I'm trying to answer is whether there are recurring (sub)patterns that different users adopt when asked to perform a certain specific task -- in this…

asked Sep 20 '17 at 12:37

Morpheu5

101
4

5

votes

3 answers

Can you confirm that the transformer works strictly deterministically and there is no randomness inside or between the attention layers?

At a high level, temperature and randomness affect the output of a generative language model: Lower temperature: Produces more focused, conservative, and consistent responses. Moderate temperature: Strikes a balance between creativity and…
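To illustrate the temperature effect the excerpt describes, here is a minimal sketch, assuming temperature is applied by dividing the logits before a softmax (a common convention, not something confirmed by the question itself). The function name and the example logits are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Dividing by the temperature sharpens (T < 1) or flattens (T > 1) the scores.
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.5)   # more peaked distribution
high = softmax_with_temperature(logits, 2.0)  # flatter distribution
```

In this sketch the softmax itself is deterministic for a given temperature; randomness only enters if a token is then sampled from the resulting distribution.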

asked Jun 07 '23 at 10:53

Hans-Peter Stricker

891
1
8
21

5

votes

1 answer

Why is AI safety so much harder than Isaac Asimov's "Three Laws of Robotics"?

I understand that AI researchers are trying to create AI designs that allow for desired behavior without undesirable side-effects. A classic example of an attempt is Isaac Asimov's Three Laws of Robotics. This idea seems to have been debunked due to…

asked Sep 16 '17 at 23:12

N00b101

191
1
5

5

votes

2 answers

Machine learning with graph as input and output

In my application, I have inputs and outputs that could be represented as graphs. I have a number of acceptable pairs of input and output graphs. I want to use these to train a model. I am looking for pointers where simple examples of learning…

asked Sep 15 '17 at 07:23

Suresh

159
6

5

votes

2 answers

What is curriculum learning in reinforcement learning?

I recently came across the term "curriculum learning" in the context of DRL and was intrigued by its potential to improve the learning process. As such, what is curriculum learning? And how can it be helpful for the convergence of RL algorithms?

asked Apr 29 '23 at 14:38

Robin van Hoorn

2,366
1
10
33

5

votes

2 answers

How is the next token predicted in transformers?

In the transformer (or GPT/decoder only), at the end of the decoder blocks but before the final linear layer you have X vectors (for the X tokens at the input of the decoder). We then want to compute the probabilities for the next token of the…

asked Apr 21 '23 at 00:48

Miguel Carvalho

51
1
2

5

votes

2 answers

What makes the approximation capabilities of neural networks different than something like, say, Fourier series?

People often cite the universal approximation theorem as a reason for why neural networks are so effective at capturing patterns or features of various training data. However, this seems unremarkable to me, because something like Fourier series are…

asked Mar 23 '23 at 00:48

Maximal Ideal

153
4

5

votes

2 answers

Constructing a dataset that scores well only for a specific set of hyperparameter values

When designing a machine-learning system, there are various parameters that have to be determined. I am interested in the following general question: is it possible to construct a dataset on which the system will have good performance with some…

asked Sep 05 '17 at 05:20

Erel Segal-Halevi

285
1
5

5

votes

1 answer

What would an implementation of this Neural Network look like?

I'm relatively new to neural networks and was wondering what an implementation of this paper would look like. More specifically, how are the correct values of Kp, Ki, and Kd determined at run time so it can be back propagated?

asked Sep 04 '17 at 20:05

Beryllium

217
1
3

5

votes

3 answers

How does a neural network classifier classify from just drawing a decision plane?

I understand that a neural network basically distorts (non-linear transformation) and changes the perspective (linear transformations) of the input space to draw a plane to classify data. How does the network deduce if an input is on one side of a plane and…

asked Sep 03 '17 at 20:13

Daniel

326
2
9
