Questions tagged [artificial-intelligence]
Questions about the design and properties of agents that act in a dynamic environment and make decisions towards some goal without user control.
571 questions
0 votes
0 answers
25 views
Understanding reduce operations in PyTorch and autodiff. Confused on Operation tracking
I am trying to understand how the reduce operation that PyTorch performs in its backward pass for broadcasted tensors actually works under the hood. I am trying to make a C++ library for neural networks ...
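One way to picture the reduction this question refers to: when a smaller tensor is broadcast in the forward pass, each of its elements feeds several outputs, so its gradient is the upstream gradient summed over the broadcast axes. The following is a minimal pure-Python sketch of the simplest case, a bias row of length n added to every row of an m x n matrix. The names `add_bias` and `bias_grad` are hypothetical, and this is an illustration of the idea, not PyTorch's actual implementation.

```python
def add_bias(x, b):
    """Forward pass: broadcast b (length n) across each row of x (m x n)."""
    return [[xij + bj for xij, bj in zip(row, b)] for row in x]


def bias_grad(grad_out):
    """Backward pass for b: sum the upstream gradient over the broadcast (row) axis."""
    n = len(grad_out[0])
    return [sum(row[j] for row in grad_out) for j in range(n)]


x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [10.0, 20.0, 30.0]
y = add_bias(x, b)

# Upstream gradient of loss = sum(y): every entry of y contributes 1.
grad_out = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

# Each b[j] was used in m = 2 rows, so its gradient is the column sum.
grad_b = bias_grad(grad_out)
```

The gradient with respect to `x` needs no reduction here because `x` already has the output's shape; only the operand that was expanded gets summed back down.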
0 votes
0 answers
30 views
When predicting density maps with CNNs: is using MSE more appropriate than pixelwise sigmoid activation + binary cross entropy?
I'm building a U-Net for predicting density maps. The ground truth maps are generated by labeling centroids in the objects of interest in the original image (they are all of the same class), forming a ...
0 votes
0 answers
64 views
Can prompt injection be used to circumvent the intended use of LLMs?
Let us say that in the not-so-distant future, people might have to convince an LLM that they are worthy of a job. Is it possible to use prompt injection to convince the LLM that you are worthy of a ...
-3 votes
1 answer
93 views
Algorithm Discovery Using ChatGPT o3-mini-high?
I gave this prompt to ChatGPT o3-mini-high: Give me a novel algorithm not discovered yet that reduces the complexity of matrix multiplication. Do you think this algorithm is correct and truly novel? ...
0 votes
0 answers
34 views
Training A Convolutional Neural Network
So I am building a CNN from scratch. I built all the layers with the first layer of convolution->pooling, the second layer of convolution->pooling, flatten, feed into a deep network (built that ...
0 votes
1 answer
112 views
I need help in designing a genetic algorithm for matchmaking in ecommerce
I would like to implement a genetic algorithm to solve the matchmaking problem between offers and demands in a marketplace. I found a research paper which proposes the following encoding: each ...
1 vote
1 answer
87 views
How does the RETE algorithm for expert production systems work?
I am struggling to understand this algorithm. Could anybody explain to me how the tree is built in RETE and how it helps with concrete inputs?
3 votes
2 answers
215 views
Is there any way to program a chess bot which never loses?
Even if it may be complicated, is it possible with the present technology?
1 vote
0 answers
115 views
Simplest way to incorporate edge types into self-attention (in a graph transformer)?
In the GT (Graph Transformer) model presented by Dwivedi & Bresson in "A Generalization of Transformer Networks to Graphs", the following equations are used to update node and edge ...
-1 votes
1 answer
80 views
Forming Sentences
I have a bunch of transcripts from online videos but I suspect the transcripts aren't formatted well. There is no punctuation, sentences are broken abruptly, some words aren't complete (for example ...
1 vote
2 answers
256 views
How much do LLMs know?
Can we assume that a 7 GB LLM knows 7 GB of text information? (By 7 GB I mean a 7-billion-parameter model at Q8 quantization, or a 1.75-billion-parameter model at full fp32 precision.) For example, a 70,...
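The size arithmetic in this question can be checked with a short sketch. It assumes Q8 stores exactly one byte per parameter and fp32 four bytes, and it ignores any quantization metadata such as per-block scale factors, so real files will be somewhat larger. The name `weight_size_gb` and the `BYTES_PER_PARAM` table are hypothetical helpers for illustration.

```python
# Assumed storage cost per parameter, in bytes (quantization metadata ignored).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "q8": 1}


def weight_size_gb(num_params, precision):
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


size_7b_q8 = weight_size_gb(7e9, "q8")
size_1_75b_fp32 = weight_size_gb(1.75e9, "fp32")
```

Both configurations come out to the same storage footprint, which is what the question means by a "7 GB" model. The footprint only measures how many bytes the weights occupy; it says nothing by itself about how much text information those weights encode.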
2 votes
2 answers
137 views
Gödel's theorem and machines' power
I was studying AI when a question came to my mind. I know that one of the objections to the possibility of a thinking machine examined by Turing is the so-called mathematical objection, ...
1 vote
1 answer
95 views
Linear relationship between the log-odds and the features
In this post I asked about why the sigmoid/softmax function was used in classification: Binary Classification- Non-Differentiable Loss Function But I have a follow-up question: We're assuming that the ...
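The relationship named in the title can be verified numerically: if the predicted probability is the sigmoid of a linear score, then the log-odds of that probability equal the linear score itself. The sketch below assumes a two-feature model; the weights, bias, and input values are made up for illustration, and `linear_score`, `sigmoid`, and `log_odds` are hypothetical helper names.

```python
import math

# Hypothetical model parameters, chosen only for illustration.
WEIGHTS = [0.5, -1.2]
BIAS = 0.3


def linear_score(x):
    """z = w . x + b, linear in the features."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """log(p / (1 - p)), the inverse of the sigmoid."""
    return math.log(p / (1.0 - p))


x = [2.0, 1.0]                 # hypothetical feature vector
z = linear_score(x)            # linear in x
p = sigmoid(z)                 # predicted probability
recovered = log_odds(p)        # equals z up to floating-point error
```

Because `log_odds` inverts `sigmoid`, choosing the sigmoid as the output function is the same as assuming the log-odds are linear in the features.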
2 votes
3 answers
3k views
TOPS (trillion operations per second) to tokens per second
A lot of AI hardware coming out lately has its performance stated in TOPS, i.e., trillion operations per second. Does anyone have an idea how to estimate the LLM performance on such hardware in tokens ...
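A rough back-of-the-envelope estimate for this kind of question can be sketched as follows. It rests on two common rules of thumb that are assumptions here, not vendor figures: generating one token costs about 2 operations per model parameter, and single-stream decoding must read all the weights from memory once per token, so memory bandwidth often sets a much lower ceiling than TOPS does. All function names and the device numbers below are hypothetical.

```python
def compute_bound_tokens_per_second(tops, num_params, ops_per_param=2.0):
    """Ceiling from raw compute: operations available / operations per token."""
    return tops * 1e12 / (ops_per_param * num_params)


def bandwidth_bound_tokens_per_second(bandwidth_gb_s, model_size_gb):
    """Ceiling from memory traffic: all weights are read once per token."""
    return bandwidth_gb_s / model_size_gb


def estimated_tokens_per_second(tops, num_params, bandwidth_gb_s, model_size_gb):
    """The tighter of the two ceilings; real throughput is usually lower still."""
    return min(
        compute_bound_tokens_per_second(tops, num_params),
        bandwidth_bound_tokens_per_second(bandwidth_gb_s, model_size_gb),
    )


# Hypothetical device: 40 TOPS, 100 GB/s memory bandwidth,
# running a 7-billion-parameter model stored in 7 GB.
compute_limit = compute_bound_tokens_per_second(40, 7e9)
bandwidth_limit = bandwidth_bound_tokens_per_second(100, 7)
estimate = estimated_tokens_per_second(40, 7e9, 100, 7)
```

Under these assumptions the bandwidth ceiling is far below the compute ceiling, which is why a TOPS figure alone tends to overstate single-user token rates.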
1 vote
1 answer
72 views
Binary Classification- Non-Differentiable Loss Function
For binary classification using linear regression, we pass the output z of the linear regression through the sigmoid function so that if the linear regression takes an input x which should be ...