Whether two users are considered a match is decided by comparing the score to a threshold. A score below the threshold means the users are probably not a match; a score above it means they are probably a match (the same person). The threshold is a boundary you choose yourself, based on the false accept rate (FAR) you can tolerate. Score values are normalized using the formula -10*log10(FAR). This means a score of 30 corresponds roughly to a FAR of 1:1000 (1 falsely accepted face in 1000), and a score of 50 corresponds roughly to a FAR of 1:100000. Because the score is logarithmic, you cannot treat it as a linear indicator of match quality. For example, scores of 50 and 500 do not mean "a weak match" and "a super match", but rather a "good" and a "slightly better" match.
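As a rough illustration of the normalization formula, the sketch below converts between a FAR and a score and applies a threshold. The function names are hypothetical and not part of any SDK, and treating only scores strictly above the threshold as a match is an assumption made for this example.

```python
import math


def far_to_score(far: float) -> float:
    """Convert a false accept rate (0 < far <= 1) to a normalized score."""
    if not 0.0 < far <= 1.0:
        raise ValueError("far must be in the interval (0, 1]")
    return -10.0 * math.log10(far)


def score_to_far(score: float) -> float:
    """Invert the formula: the nominal FAR that corresponds to a score."""
    return 10.0 ** (-score / 10.0)


def is_match(score: float, threshold: float) -> bool:
    """Assumption: only a score strictly above the threshold is a match."""
    return score > threshold


if __name__ == "__main__":
    for far in (1e-3, 1e-4, 1e-5):
        print(f"FAR {far:g} -> score {far_to_score(far):.1f}")
    print("score 42 vs threshold 40:", is_match(42.0, 40.0))
```

Each additional 10 points of score corresponds to a tenfold lower nominal FAR, which is why score differences should not be read on a linear scale.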
Keep in mind that the normalization formula is based on general data. Your own dataset will produce different FARs, so the threshold will need to be adjusted.
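One possible way to adjust the threshold is to measure the FAR empirically on your own data. The sketch below assumes you have collected scores from comparisons of different people (impostor pairs); the function names and the sample scores are made up for illustration, and a match is again assumed to mean a score strictly above the threshold.

```python
import math
from typing import Sequence


def empirical_far(impostor_scores: Sequence[float], threshold: float) -> float:
    """Fraction of impostor comparisons that would be falsely accepted."""
    if not impostor_scores:
        raise ValueError("at least one impostor score is required")
    accepted = sum(1 for s in impostor_scores if s > threshold)
    return accepted / len(impostor_scores)


def threshold_for_far(impostor_scores: Sequence[float], target_far: float) -> float:
    """Lowest observed score usable as a threshold without exceeding target_far."""
    if not impostor_scores:
        raise ValueError("at least one impostor score is required")
    ranked = sorted(impostor_scores, reverse=True)
    # Number of false accepts we can afford on this sample.
    allowed = math.floor(target_far * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")  # every impostor may be accepted
    # At most `allowed` scores are strictly above ranked[allowed].
    return ranked[allowed]


if __name__ == "__main__":
    # Made-up impostor scores, for illustration only.
    scores = [5, 12, 18, 22, 25, 31, 8, 15, 27, 35]
    t = threshold_for_far(scores, target_far=0.1)
    print("threshold:", t, "empirical FAR:", empirical_far(scores, t))
```

A sample this small cannot resolve low FARs; estimating a FAR of 1:100000 with any confidence needs far more than 100000 impostor comparisons.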