User Details
- User Since
- Oct 3 2014, 5:57 AM (601 w, 2 d)
- Availability
- Available
- LDAP User
- Giuseppe Lavagetto
- MediaWiki User
- GLavagetto (WMF) [ Global Accounts ]
Yesterday
Fri, Apr 10
I will add a couple of points here:
Wed, Apr 8
FWIW, I think this introduces both a discrepancy with the logic we adopt at the edge and a matter of inequality: bots that control a large enough IP space would get a much larger limit than a community member's bot, which typically comes from a single source IP.
Fri, Apr 3
Wed, Mar 25
If you want to expose the HTTP API for users, that would happen via the component the linked artifact service would expose. I don't understand what the issue is here.
The way I see it this was a case of "failure in depth".
Mon, Mar 23
Just my 2 cents about various objections I've seen raised in this task:
- If anything, we should've started moving to gRPC for internal service-to-service communication a long time ago; we haven't done it mostly because we didn't have immediate needs.
- Our mesh is not built "around HTTP" any more than it's built around gRPC; it can work perfectly well with gRPC. In fact, quite a few internal functions of our service mesh are built using gRPC (for the same reasons it would make sense to use it here).
- gRPC uses HTTP/2 for transport, and it works perfectly fine with our mesh, our ingress, and our routing logic.
- I'm not sure I understand the comment about needing an HTTP replica, but frankly, what stops you from running your "service" (which is just a lambda) with a sidecar of the linked artifact cache to provide the HTTP interface?
Mon, Mar 16
Mar 2 2026
For now, let's remove the individual headers at the edge; we will have time to come up with a prefix naming (if any) that we want to use going forward, and to convert the headers to those values.
Feb 27 2026
I don't want us to make everything under x-wmf be stripped automatically; it would limit us in the future.
Feb 23 2026
Added an exception for translatewiki so they can keep syncing from gerrit at the previous rate, given that it never created an issue for us.
Ah, I've seen the script (thanks @Nikerabbit); the first thing to do is make the user-agent compliant with the Wikimedia User-Agent policy, so it should contain an email address or a URL.
Do you happen to know what user-agent the script uses? Or could you point me to its source code?
Feb 20 2026
Feb 19 2026
Feb 18 2026
Feb 17 2026
Feb 6 2026
Hi @Salujapushpit - I'm the project maintainer, I have seen your MR, I will review it.
Feb 3 2026
I had resolved this bug quite some time ago.
Given how large the current set of actions is, this would result in a super clogged/unusable UI.
I think we have a better chance of actually integrating Druid in some form within HP, rather than allowing this.
This has been indeed implemented already by @CDanis
Hi @Scott_French, I assume this task is resolved?
Not sure I get what you're referring to. The buttons are not javascript actions but simple submit buttons for HTML forms; they shouldn't be clickable more than once. Without reproduction steps I don't know what to do with this bug.
Feb 2 2026
Thank you for tagging this task with good first task for Wikimedia newcomers!
Jan 21 2026
I should add that it's also possible Huggle doesn't actually send auth cookies with every request; in that case, you get rate-limited to 100 requests over 10 seconds, unless you're using the tool from Toolforge.
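To illustrate the limit described above (100 requests over a 10-second window for unauthenticated traffic), here is a minimal sliding-window sketch in Python. This is purely illustrative: the real enforcement happens at the edge (requestctl/Varnish), not in application code, and the class and parameter names are made up.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Hypothetical sketch of a per-client sliding-window rate limit,
    e.g. 100 requests per 10 seconds for logged-out traffic."""

    def __init__(self, max_requests=100, window_seconds=10.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = deque()  # timestamps of accepted requests

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.hits and now - self.hits[0] >= self.window:
            self.hits.popleft()
        if len(self.hits) < self.max_requests:
            self.hits.append(now)
            return True
        return False
```

With these defaults, the 101st request inside the same 10-second window is rejected, and capacity frees up as old timestamps age out.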
Ah, sorry, you said above that you're using a bot password; I missed that. Hmm, this means the hotfix I deployed to patch the issue until T415007 is resolved probably isn't working. I'll investigate further tomorrow to see whether my hotfix can work at all.
@Framawiki the error message you get seems to indicate that your tool is not authenticated, but the UA is correctly detected as a bot:
Jan 20 2026
Jan 15 2026
I like the idea of having a full-stack view of traffic problems, especially on the media-serving side of things, which I'd focus on more at first. I would actually create two separate dashboards for text and media-serving.
Jan 13 2026
Hi, I still see a lot of requests from your IPs with user-agent Faraday v2.14.0.
Jan 12 2026
Jan 10 2026
I'll tag the task as solved; please let us know if you still encounter problems. Thanks to @Reedy for the assistance over the weekend!
I have added a hotfix that should go live soon-ish (in the next hour or so).
I guess there's some bug in AutoWikiBrowser causing those requests to lack authentication cookies; otherwise you wouldn't get that error, which can only happen for logged-out users.
Jan 9 2026
@Wikiwan I guess you got some message with the 429 responses, right? And those messages didn't suggest opening a task, but rather contacting an email address.
To clarify:
This user agent is not compliant with our user-agent policy:
I'm not sure having your CI depend on external resources is a good policy; I encourage you to change that long-term, but anyway, we don't want to block the work on pywikibot right now.
Exception added. I allowed a generous amount of requests; please let us know if you still run into problems.
@Ragesoss I see you still get blocked from time to time; I will add an exception, per https://wikitech.wikimedia.org/wiki/Robot_policy#What_to_do_if_these_limits_are_too_strict_for_me?, using the IPs that you just provided.
Jan 8 2026
@Ragesoss as far as I can tell, the problem is that you are not honoring the Wikimedia User-Agent policy, and we have recently started enforcing stricter rate limits for bots that don't respect it, hence you are being blocked.
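For reference, making a client compliant with the User-Agent policy usually just means sending a descriptive UA string with a contact. A small stdlib-only Python sketch; the tool name, version, and contact details below are placeholders, not anything from this task:

```python
import urllib.request

# Per the Wikimedia User-Agent policy, a bot's User-Agent should identify
# the tool and include a contact (an email address or a URL).
# "ExampleBot", its version, and the contact below are hypothetical.
USER_AGENT = (
    "ExampleBot/1.0 (https://example.org/examplebot; bot-admin@example.org)"
)

def fetch(url):
    """Issue a request that carries a policy-compliant User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

The key point is that the UA identifies who to contact when the bot misbehaves; a generic default like `Faraday v2.14.0` or `python-urllib` does not.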
@Ragesoss what is the User-Agent you use when making those requests?
Indeed T391397 is the cause of that other rule. I think we should just ban their UA completely.
In fact, we have a requestctl rule https://requestctl.wikimedia.org/action/cache-text/datadog_synthetics for rate-limiting them. The rule is only applied to cache misses, because we had no reason to block cache hits.
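The "only applied to cache misses" behaviour can be sketched as follows; this is a toy Python model of the control flow, not the actual requestctl/edge code, and all names are illustrative:

```python
class CounterLimiter:
    """Toy limiter: a fixed budget of allowed origin requests."""
    def __init__(self, budget):
        self.budget = budget

    def allow(self):
        if self.budget > 0:
            self.budget -= 1
            return True
        return False

def handle(request, cache, limiter):
    """Rate-limit only on cache miss: hits never consume limiter budget,
    so there is no reason to block traffic the cache absorbs anyway."""
    key = request["url"]
    if key in cache:
        return cache[key]           # cache hit: served, never rate-limited
    if not limiter.allow():         # cache miss: consumes limiter budget
        return {"status": 429}
    response = {"status": 200, "body": "origin response for " + key}
    cache[key] = response
    return response
```

This mirrors the rationale above: the expensive path (origin fetch) is what the rule protects, so cache hits pass through untouched.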
Datadog Synthetics is a Datadog service that we had to block at some point. Apparently some people like to use Datadog's synthetic monitoring to check our infrastructure, for unforeseen reasons. I can't rule out that this is something fishy, though; I'll look into it a bit more.
Jan 5 2026
@TheDJ the 429s are probably due to the massive scraping activity SREs have had to deal with over the holidays.
Dec 29 2025
Can you please report the full error message you get in the response body? Thanks
Nov 30 2025
A suggestion: when reporting the latencies, it would be useful to also include the standard deviation, which ab(1) provides. The numbers are so close to each other that the differences between solutions might well fall within one standard deviation, and having it would help pick the option with the better size/performance tradeoff.
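The check suggested above can be sketched with the stdlib `statistics` module: compute mean and sample standard deviation for each candidate's latency samples, and flag comparisons where the means differ by less than one standard deviation. The "within one sigma" rule here is a rough heuristic, not a proper significance test:

```python
import statistics

def summarize(latencies_ms):
    """Mean and sample standard deviation of a latency series,
    analogous to the mean/sd columns ab(1) reports."""
    return statistics.mean(latencies_ms), statistics.stdev(latencies_ms)

def within_one_sigma(a, b):
    """Rough heuristic: True if the two means differ by less than the
    larger of the two standard deviations, i.e. the measured
    difference may not be meaningful."""
    (mean_a, sd_a) = summarize(a)
    (mean_b, sd_b) = summarize(b)
    return abs(mean_a - mean_b) <= max(sd_a, sd_b)
```

For a rigorous comparison one would use a two-sample t-test instead, but even this crude check catches the case where two candidates are statistically indistinguishable.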
Nov 26 2025
Nov 25 2025
This problem is blocking KR work for WE5 (responsible content reuse); please treat it with some urgency. This code is preventing us from severely rate-limiting non-standard thumbnail-size generation, which is one of the pillars of our work.
Nov 18 2025
I'm very happy you're going with the option @jijiki recommended, which sounds like both the path of least resistance and the best option.
Nov 6 2025
As my closing 2 cents about this whole system: