3 votes
1 answer
190 views

I have the following C code that I am testing to understand perf and caching. It sequentially accesses an array of doubles. // test.c #include <stdio.h> #include <stdlib.h> #include <...
user180574
  • 6,244
2 votes
1 answer
77 views

I have an open-source C/C++ program on Linux amd64 that processes a PDF input file. I did not write it myself, so I'm not familiar with its code. Processing a PDF file read from local disk ...
MrSnrub
  • 1,265
1 vote
1 answer
78 views

I'm experimenting with perf record --control to profile select sections of a program. Here's a Rust program that uses perf to profile the call to a function waste_time(): use libc; use log::info; use ...
Edd Barrett
  • 3,685
1 vote
0 answers
160 views

I’m using the PyTorch profiler to analyze sglang, and I noticed that in the CUDA timeline, some kernels show “Command Buffer Full”. This causes the cudaLaunchKernel time to become very long, as shown ...
plznobug
  • 143
0 votes
0 answers
35 views

I'm trying to profile a Python FastAPI application (which uses LangGraph) using Scalene on Windows. Since Scalene's Windows version doesn't support multithreading, I'm running it in WSL instead. When ...
Raffa50
  • 23
2 votes
0 answers
61 views

I had a multi-process application to profile using perf with the following command: sudo perf record -a -g -F 99 -e cycles:u -- sleep 50000 & The sleep time is over 13 hours. The program should ...
Bartłomiej Dudek
1 vote
0 answers
85 views

This question came up while I was saving a large number of model-inferred embeddings to plain text. To do so, I needed to convert lists of float embeddings into strings, and I found this conversion to ...
K_Augus
  • 474
1 vote
0 answers
34 views

I am using perf to profile workloads on my system, and I need to track the memory traffic generated by my workload on each NUMA node. Currently, I only have perf results for LLC cache misses, which ...
smz
  • 515
1 vote
1 answer
44 views

I have a JSON file that contains profiling data that can be opened with Chrome's trace viewer. I can do it manually by opening chrome://tracing, then selecting 'load' and then loading the JSON file. ...
Crumml
  • 81
0 votes
0 answers
105 views

We tried profiling a simple MAC operation using both RISC-V Vector (RVV) intrinsics and plain C code. Surprisingly, the C version performs better, even though the intrinsics code processes 16 ...
shreyas
0 votes
0 answers
99 views

I have only just started using heaptrack and cannot set up filtering by module. It is possible to do this from the GUI, like this Heap track, but the output is very noisy and this filter doesn't affect the other tabs. Does there exist ...
Александр Чулгарев
1 vote
1 answer
62 views

I have a Helidon app and would like to take CPU samples and/or start a CPU profiler. This does not work. With the same setup, it works for a simple (non-Helidon) app. Trying to start the CPU (and also ...
Itchy
  • 2,464
1 vote
1 answer
31 views

I am analyzing my NumPy/Python code by running it with "-m cProfile". Snakeviz shows this as the entry with the most time spent: 20895038 calls to ufunc_api.py:173(__call__) with the majority of the ...
j13r
  • 2,731
1 vote
0 answers
148 views

On Linux, I often find myself perusing perf stat to figure out whether a code change improved things like cache miss rate. (I'm specifically interested in cache miss rates and page faults.) Now I'm ...
Marcus Müller
1 vote
1 answer
170 views

I'm developing a Flutter application that doesn't utilize emojis in any part of the UI or logic. However, upon profiling the app using Android Studio's Memory Profiler, I observed that androidx.emoji2....
Filip Golovic
