1. Review
There are no docstrings. What do all these functions do? How do I call them? What do they return?
The code is not portable to Python 3 because it uses xrange and the print statement.
The function isSquare is not used and could be removed.
The function lattice does not actually make use of the memoization (it doesn't look at the dictionary memo until it has already computed the result).
There's no need to use math.fabs as the vertices are all at integer points. You could use abs instead.
The program is hard to test because it has code that runs at the top level. This means that you can't run it from the interactive interpreter, or using timeit. It would be better to organize the code into functions to make it easier to test.
It's inconsistent for the function PolygonArea to have a name using CamelCase but other functions (lattice, boundary, interior) to have names using lower case.
The nested loops can be replaced by a single loop using itertools.product.
In order to work on the performance, it would be handy if the parameter \$m=100\$ were not hard-coded. We'd like to be able to run the code with smaller values of \$m\$ and see how the runtime behaves.
2. Revised code
from fractions import gcd from itertools import product def polygon_area(corners): """Return the area of the polygon with the given vertices.""" n = len(corners) area = 0.0 for i in range(n): j = (i + 1) % n area += corners[i][0] * corners[j][1] area -= corners[j][0] * corners[i][1] area = abs(area) / 2.0 return area _memo = {} def lattice(p1, p2): """Return the number of lattice points on the line from p1 to p2.""" t = abs(p2[0] - p1[0]), abs(p2[1] - p1[1]) if t not in _memo: g = gcd(t[0], t[1]) + 1 _memo[t] = g return _memo[t] def boundary(p1, p2, p3, p4): """Return the number of lattice points on the boundary of the quadrilateral with vertices p1, p2, p3, p4. """ return lattice(p1, p2) + lattice(p2, p3) + lattice(p3, p4) + lattice(p4, p1) - 4 def interior(p1, p2, p3, p4): """Return the number of lattice points in the interior of the quadrilateral with vertices p1, p2, p3, p4. """ corners = [p1, p2, p3, p4] A = polygon_area(corners) B = boundary(p1, p2, p3, p4) I = A - B/2 + 1 return I def problem504(m=100): """Return the number of quadrilaterals with vertices lying at lattice points on the coordinate axes, no more than m units from the origin, whose interior contains a square number of lattice points. """ squares = set(i ** 2 for i in range(2 * m)) count = 0 for a, b, c, d in product(range(1, m + 1), repeat=4): I = interior((a, 0), (0, b), (-c, 0), (0, -d)) if I in squares: count += 1 return count
3. Measurement
We expect the runtime to be proportional to \$m^4\$, so let's see how this performs on smaller values of \$m\$ and then fit a curve to figure out the projected runtime for \$m = 100\$. Note that because lattice is memoized, we have to be careful to reset the memoization each time.
>>> from timeit import timeit >>> _memo={}; timeit(lambda:problem504(10), number=1) 0.10902838699985296 >>> _memo={}; timeit(lambda:problem504(20), number=1) 1.7153344419784844 >>> _memo={}; timeit(lambda:problem504(30), number=1) 8.644499261979945 >>> _memo={}; timeit(lambda:problem504(40), number=1) 28.569149894057773
Let's fit these results to the equation \$t = am^4\$:

Gnuplot finds \$a = 1.11 × 10^{-5}\$, so we predict that when \$m=100\$, the runtime will be about 1110 seconds (about 19 minutes).
4. Performance improvement
The code here is general-purpose — for example polygon_area computes the area of any polygon. But in this problem we don't need the general case — we only have quadrilaterals whose vertices are at lattice points on the coordinate axes. So the obvious thing to do is to reduce the amount of work by only computing the bare minimum, and reusing as many results as we can.
The key observation is that we can compute the number of interior points in the quadrilateral from the number of interior points in the four right-angled triangles in the four quadrants, like this:

If I write \$T(a, b)\$ for the number of interior points in the triangle with vertices \$(0, 0)\$, \$(a, 0)\$ and \$(0, b)\$, then this diagram shows that the value we want is $$Q(a, b, c, d) = T(a, b) + T(b, c) + T(c, d) + T(d, a) + a + b + c + d - 3.$$ The value of \$T(a, b)\$ can be computed in various ways, for example using Pick's theorem. I like to use this diagram:

in which there are \$(a - 1)(b - 1)\$ dots altogether, of which \$\gcd(a, b) - 1\$ lie exactly on the hypotenuse (the grey dot), and the remainder divide into two equal halves (the blue and red dots), so $$T(a, b) = {(a - 1)(b - 1) - \gcd(a, b) + 1 \over 2}.$$ In this example, \$a=10\$ and \$b=12\$ so \$\gcd(a, b) = 2\$ and \$T(a, b) = 49\$.
There are now some micro-optimizations we can make:
Let \$U(a, b) = T(a, b) + a\$. Then $$Q(a, b, c, d) = U(a, b) + U(b, c) + U(c, d) + U(d, a) - 3$$ which avoids four additions.
Instead of creating a set of squares, create a set of squares plus 3. Then we can test \$U(a, b) + U(b, c) + U(c, d) + U(d, a)\$ to see if it is a square plus 3. This saves a subtraction.
Instead of maintaining a set of squares plus 3, create a list whose \$i\$th entry is True if \$i\$ is a square plus 3, and False otherwise. Fetching a list entry is faster than determining if an item is in a set.
That results in the following code:
def problem504a(m=100): """Return the number of quadrilaterals with vertices lying at lattice points on the coordinate axes, no more than m units from the origin, whose interior contains a square number of lattice points. """ square_plus_three = [False] * (4 * m * m + 3) for i in range(2 * m): square_plus_three[i * i + 3] = True R = range(m + 1) U = [[((a - 1) * (b - 1) - gcd(a, b) + 1) // 2 + a for a in R] for b in R] count = 0 for a, b, c, d in product(range(1, m + 1), repeat=4): if square_plus_three[U[a][b] + U[b][c] + U[c][d] + U[d][a]]: count += 1 return count
After all those micro-optimizations, we had better check that we haven't made a mistake. So let's test the new code against the old code:
>>> all(problem504(i) == problem504a(i) for i in range(20)) True
And see what the performance is like:
>>> timeit(lambda:problem504a(10), number=1) 0.007875622948631644 >>> timeit(lambda:problem504a(20), number=1) 0.09821957419626415 >>> timeit(lambda:problem504a(30), number=1) 0.49982553208246827 >>> timeit(lambda:problem504a(40), number=1) 1.6008274049963802 >>> timeit(lambda:problem504a(50), number=1) 4.018431055126712
The runtime is still \$Θ(m^4)\$ so here's the best fit to \$t = am^4\$ as before:

Gnuplot finds \$a = 6.4 × 10^{-7}\$ and so we predict that when \$m=100\$, the runtime will be about 64 seconds. In practice it takes a little longer than that:
>>> timeit(lambda:problem504a(100), number=1) 69.0101949761156