Let $k\in \mathbb N$
I'm looking for a small NFA build for the language of concatenation of two words of the length $k$ which are index-wise different, i.e. $$L_k=\{u\cdot v \in \Sigma^* : |u|=|v|=k\wedge \forall i, u_i\neq v_i\}$$
Notice that since $k$ is fixed, $|L_k|=(|\Sigma|\cdot(|\Sigma|-1))^k$, and is regular as a finite language.
$%$
The trivial DFA for the language contains $k$$|\Sigma| \choose k$$+1$ states and just "remembers" which letters it have seen during the first $k$ letters, however if $k=o(|\Sigma|)$, we can create a significantly smaller NFA.
$%$
A "simple" NFA for it would be of size $O^*(2^{2k})$ (more precisely, $O(k^2 \log |\Sigma| 2^{2k+O(\log^2 k)})$ ):
$%$
Take a $(|\Sigma|,2k)$-universal set (i.e. a set of vectors $\mathcal V\subseteq \{0,1\}^{|\Sigma|}$ such that for every vector $u\in \{0,1\}^2k$ and a $2k$ indices, there exists a vector $v\in \mathcal V$ such that $v_{|I}=u$, i.e. if we look only at these $k$ indices in $v$ we find $u$). Such families of size $O^*(2^{2k})$ are known.
We think of each vector $v$ as a function $\Sigma\to \{0,1\}$, where $v(\sigma_i)=1⟺v_i=1$.
$%$
Create an NFA as follows:
From the starting state $q_0$ create an $\epsilon$-transition to a new state for any $v\in \mathcal V$. Denote this state by $q_v$.
From each $q_v$ build a path of states which accepts all words whose first $k$ letters are mapped by $v$ to 0, and the later $k$ letters are mapped by $v$ to 1.
$%$
Basically, the idea is that the universal set allowes us to partition the letters which are allowed to appear in the first $k$ symbols from the rest, and we get every word in the language since there has to be a corresponding vector $v\in \mathcal V$ which partitions it right.
So the questions are:
What is the size of the smallest NFA for $L$?
Is this construction optimal?
What lower bound we can prove for such automaton size?