The first step is to characterize what can cause this type of problem. Since this is related to selecting the correct language for sections of the code, start by considering the following:
- How is the language detected? Is it based on information from the HTTP request? Is it based on session information, or is it based on database fields? In essence, can this be a problem related to how your app selects the language for each section?
- How is the language displayed? Are you pulling from a properties file, or a database? Is it possible the reference to the correct language is getting lost somehow? Is the mixed-in language you see always the default for the site?
- Is there a correlation to the client environment? This is related to the first bullet, but goes a bit further. I've had strange rendering problems due to downstream caching proxies. Typically those problems show up as a whole page that is stale, or as one person's page being served to other users (that was embarrassing).
- Are you using a Thread Local value? If a request is handled by more than one thread, the thread local value will have different information based on the thread that is working at the time. In a web server environment, you can't assume that the thread that you started processing on will be the same thread you complete processing on--unless that is part of the spec for your platform. Server writers have found that if they reuse a small pool of threads and multiplex work to them in chunks, they can handle more requests simultaneously. Even if you have one thread from the start to the finish of a request, the server may be multiplexing other requests onto that thread at the same time. Instead of thread locals, consider binding that value to the request or session attributes.
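To make the thread-local point concrete, here is a minimal sketch of one way it can bite: a pooled thread keeps a value set by an earlier request. Everything in it is hypothetical (the `Request` class, the `CURRENT` holder, the English site default); it is not your app's code, just an illustration of the failure mode next to the request-bound alternative.

```java
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalLeakDemo {
    // Hypothetical holder, standing in for wherever the app stashes the language.
    static final ThreadLocal<Locale> CURRENT = new ThreadLocal<>();

    // Assumed site default, for illustration only.
    static final Locale SITE_DEFAULT = Locale.ENGLISH;

    // Hypothetical request; in a servlet app the locale would be a request attribute.
    static final class Request {
        final Locale locale; // null when the request did not specify a language
        Request(Locale locale) { this.locale = locale; }
    }

    // Buggy variant: sets the thread local when a language is given and never clears it.
    static Locale resolveWithThreadLocal(Request r) {
        if (r.locale != null) {
            CURRENT.set(r.locale);
        }
        Locale seen = CURRENT.get();
        return seen != null ? seen : SITE_DEFAULT;
    }

    // Request-bound variant: the answer depends only on the request itself.
    static Locale resolveFromRequest(Request r) {
        return r.locale != null ? r.locale : SITE_DEFAULT;
    }

    // Runs two requests back to back on a one-thread pool and
    // returns the locale the second request ended up with.
    static Locale secondRequestSees(boolean useThreadLocal) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Request first = new Request(Locale.FRENCH);
            Request second = new Request(null); // expects the site default
            Callable<Locale> handleFirst = () ->
                useThreadLocal ? resolveWithThreadLocal(first) : resolveFromRequest(first);
            Callable<Locale> handleSecond = () ->
                useThreadLocal ? resolveWithThreadLocal(second) : resolveFromRequest(second);
            pool.submit(handleFirst).get();
            return pool.submit(handleSecond).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("thread-local:  " + secondRequestSees(true));
        System.out.println("request-bound: " + secondRequestSees(false));
    }
}
```

In this sketch the second request never asked for French, yet the thread-local variant hands it French because the pooled thread still carries the first request's value. The request-bound variant falls back to the default as intended. Whether your server reuses threads this way depends on the platform, so treat it as one hypothesis to check rather than a diagnosis.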
Now, once you've characterized the possibilities of what can go wrong, it's time to make sure you have the data you need to try and find out what did go wrong.
- Use profuse logging around the problem areas. This is a place where a tool like Log4J or Log4Net can really shine. Those logging frameworks, and others like them, allow you to turn up the logging for certain categories while keeping down the noise for everything else--all by changing a configuration file. You want to introduce new logging statements to confirm or rule out whatever you suspect is the problem. Also make sure your HTTP access logs have all the information you want about each request (cookies, HTTP headers, etc.).
- Attempt to simulate the problem. Since this happens sporadically, what is the load like on the server at the time it does occur? Are you getting hit with a number of simultaneous requests from a mix of languages? If so, attempt to simulate that kind of load in your test environment. A tool like JMeter might be what you need. You'll also want to be able to spoof IP addresses for your fake clients. Remember that IP addresses are allocated in regional blocks, so if the language is chosen by geolocating the client, your fake clients need addresses that map to the right country/region. The leading segments of the address alone are not a reliable guide, so check them against whatever geolocation data your app uses.
- The problem will be just as sporadic in your test environment, but as you narrow in on the real cause you can skew the results to make it happen more often than it does in the wild. Additionally, you can more easily review the log files and try to learn from them.
- It's an iterative process, so be patient. You have to induce the type of load you think will reproduce the bug, check the logs, and refine your tests based on what you find. The important thing is to identify the problem, so resist the urge to make some simple fixes that might only make the real problem happen less often.
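As a rough illustration of the logging advice above, here is a sketch that turns up one category while leaving everything else quiet. It uses `java.util.logging` from the JDK rather than Log4J, purely so it runs without extra dependencies; the idea carries over to Log4J's configuration file. The category names (`app.i18n`, `app.billing`) and the logged fields are assumptions for the example.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

class LanguageLogging {
    // Hypothetical category names; use whatever packages your app really has.
    static final Logger I18N = Logger.getLogger("app.i18n");
    static final Logger OTHER = Logger.getLogger("app.billing");

    static void configure() {
        // Keep the noise down everywhere by default.
        Logger.getLogger("").setLevel(Level.WARNING);

        // Turn up only the category under suspicion.
        I18N.setLevel(Level.FINE);

        // Handlers filter too (ConsoleHandler defaults to INFO),
        // so give this category its own handler at FINE.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        I18N.addHandler(handler);
        I18N.setUseParentHandlers(false);
    }

    // Log every input to the language decision, plus the thread that made it.
    static void logLanguageChoice(String requestId, String acceptLanguage,
                                  String sessionLang, String chosen) {
        I18N.fine(() -> String.format(
            "req=%s thread=%s accept-language=%s session=%s chosen=%s",
            requestId, Thread.currentThread().getName(),
            acceptLanguage, sessionLang, chosen));
    }

    public static void main(String[] args) {
        configure();
        logLanguageChoice("r-1", "fr-FR,fr;q=0.9", "fr", "fr");
        OTHER.fine("not shown: this category is still at WARNING");
    }
}
```

Including the thread name and a request id in each line is a deliberate choice: when you later line these entries up against the access log, you can see whether two requests with different languages shared a thread around the time a page came out mixed.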
Finally, once you've narrowed down the problem to the point where you know how to reproduce it, and what causes it, write the smallest automated test you can to force the issue in code. If you've narrowed the problem down to one class, or a pair of classes not working together correctly, reproduce it at that level. You shouldn't have to spawn 100 threads to do it; write the smallest test that makes the issue happen 100% of the time.
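Here is a sketch of what such a test might look like. Suppose, purely for illustration, the investigation pointed at one object shared by all requests that keeps the language in a field. The `SharedRenderer` class is made up for this example. Two threads and two latches are enough to force the bad interleaving every time, with no load test needed.

```java
import java.util.Locale;
import java.util.concurrent.CountDownLatch;

class SharedRendererRepro {
    // Hypothetical buggy class: one instance serves every request,
    // but it keeps per-request state (the locale) in a field.
    static final class SharedRenderer {
        private volatile Locale locale;
        void setLocale(Locale l) { locale = l; }
        String render() { return "lang=" + locale.getLanguage(); }
    }

    // Forces the order: A sets French, B sets German, then A renders.
    // Returns what request A rendered.
    static String reproduce() throws InterruptedException {
        SharedRenderer renderer = new SharedRenderer();
        CountDownLatch aHasSet = new CountDownLatch(1);
        CountDownLatch bHasSet = new CountDownLatch(1);
        String[] seenByA = new String[1];

        Thread a = new Thread(() -> {
            renderer.setLocale(Locale.FRENCH);
            aHasSet.countDown();
            try {
                bHasSet.await(); // pause here until B has overwritten the field
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            seenByA[0] = renderer.render();
        });

        Thread b = new Thread(() -> {
            try {
                aHasSet.await(); // make sure A went first
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            renderer.setLocale(Locale.GERMAN);
            bHasSet.countDown();
        });

        a.start();
        b.start();
        a.join();
        b.join();
        return seenByA[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("request A (French) rendered: " + reproduce());
    }
}
```

Request A asked for French but renders German, and it does so on every run because the latches pin down the ordering instead of leaving it to the scheduler. Once the real class is fixed, the same test with the expectation flipped becomes the regression test.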
Now you can fix it, and be reasonably confident that it won't come back to bite you again.