Actually, MaxDiff was originally designed to work at the level of the individual respondent, so it scales down very nicely. We can get reliable utilities at the respondent level if we show each item, say, 4 times per respondent. So if you show 11 quads (sets of 4 items each), that's 44 item exposures, which with 11 items means each respondent sees each item 4 times.
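The exposure arithmetic can be checked with a quick Python sketch. The item count of 11 is an assumption inferred from the numbers above (11 sets x 4 items / 4 exposures), and the function names and the cyclic layout are made up purely for the count. A real MaxDiff design would also balance which items appear together and in which position.

```python
import math
from collections import Counter

def sets_needed(n_items, set_size, exposures_per_item):
    """Smallest number of sets that gives every item at least the target exposures."""
    return math.ceil(n_items * exposures_per_item / set_size)

def build_design(n_items, set_size, n_sets):
    """Illustrative cyclic layout: set k holds items k, k+1, ... (mod n_items).
    Only meant to show the counts, not a balanced MaxDiff design."""
    return [[(k + j) % n_items for j in range(set_size)] for k in range(n_sets)]

def exposure_counts(design):
    """How many times each item index appears across all sets."""
    return Counter(item for item_set in design for item in item_set)

# Assumed study size: 11 items, quads, 4 exposures per item.
n_sets = sets_needed(n_items=11, set_size=4, exposures_per_item=4)
design = build_design(n_items=11, set_size=4, n_sets=n_sets)
print(n_sets)                                  # 11
print(sorted(exposure_counts(design).items()))
```

Change `n_items` or `exposures_per_item` to see how many sets a different study would need.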
Now, there's still the problem of generalizability (statistical inference): with a small sample there's a certain amount of sampling error that you can only reduce by increasing the sample size.
But as far as your 40 respondents go, you'll be measuring their utilities very well.
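To put a rough number on the sampling error side, here's a sketch using the textbook margin of error for a proportion at 95% confidence. That formula is a stand-in chosen for illustration (think "share of respondents who rank item A above item B"), not a MaxDiff-specific calculation, and the function name is hypothetical.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p with n respondents.
    Simple random sampling is assumed; z=1.96 is the usual normal critical value."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst case p = 0.5 with 40 respondents, then with four times as many.
print(f"{margin_of_error(0.5, 40):.3f}")   # 0.155
print(f"{margin_of_error(0.5, 160):.3f}")  # 0.077
```

So with 40 people, a population-level share could be off by around 15 points either way, and you need about four times the sample to halve that. None of that changes how well each individual's utilities are measured.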