Whether two users are considered a match is decided by a threshold value. A score below the threshold means the users most likely do not match; a score above it means they (probably) match, i.e. are the same person. The threshold is a boundary that you choose yourself, based on the false accept rate (FAR) you can tolerate. Score values are normalized using the formula -10*log10(FAR). A score of 30 therefore corresponds roughly to a FAR of 1:1,000 (one falsely accepted face out of 1,000), and a score of 50 to a FAR of roughly 1:100,000. Because the score follows a logarithmic curve, you cannot treat it as a linear indicator of match quality. For example, scores of 50 and 500 do not mean "a weak match" and "a super match", but rather "a good match" and "a slightly better match".
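The relationship between score and FAR can be sketched in a few lines of Python. This is only an illustration of the formula -10*log10(FAR) given above; the function names far_to_score, score_to_far and is_match are hypothetical and not part of any SDK, and treating a score exactly equal to the threshold as a match is an assumption made here, since the text only covers scores below and above it.

```python
import math


def far_to_score(far: float) -> float:
    """Score that corresponds to a false accept rate: -10 * log10(FAR)."""
    if not 0.0 < far <= 1.0:
        raise ValueError("FAR must be in the interval (0, 1]")
    return -10.0 * math.log10(far)


def score_to_far(score: float) -> float:
    """Inverse of the formula above: FAR = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)


def is_match(score: float, threshold: float) -> bool:
    """Decide on a match; a score equal to the threshold counts as a match (assumption)."""
    return score >= threshold


if __name__ == "__main__":
    # Print the score for a few FAR values, from 1:10 to 1:1,000,000.
    for exponent in range(1, 7):
        far = 10.0 ** -exponent
        print(f"FAR 1:{10 ** exponent} -> score {far_to_score(far):.0f}")
```

Each step of 10 in the score divides the FAR by ten, which is why a small change in the threshold has a large effect on the number of false accepts.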

Keep in mind that the normalization formula is based on general data. Your own dataset will produce different FARs, so the threshold will need to be adjusted.

FAR              Threshold
1:10             10
1:100            20
1:1,000          30
1:10,000         40
1:100,000        50
1:1,000,000      60
1:10,000,000     70
1:100,000,000    80
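
Because the normalization is based on general data, one possible way to adjust the threshold is to measure the FAR on your own data. The sketch below assumes you have collected scores from impostor comparisons (pairs known to be different people); the functions empirical_far and threshold_for_far are hypothetical helpers, the sample scores are made up for illustration, and a score equal to the threshold is assumed to count as a match.

```python
from typing import Optional, Sequence


def empirical_far(impostor_scores: Sequence[float], threshold: float) -> float:
    """Fraction of impostor comparisons that would be accepted at this threshold."""
    if not impostor_scores:
        raise ValueError("at least one impostor score is required")
    accepted = sum(1 for s in impostor_scores if s >= threshold)
    return accepted / len(impostor_scores)


def threshold_for_far(
    impostor_scores: Sequence[float], target_far: float
) -> Optional[float]:
    """Lowest observed score usable as a threshold with measured FAR <= target_far.

    Returns None when even the highest observed score does not reach the
    target, which means the dataset is too small to verify that FAR.
    """
    for candidate in sorted(set(impostor_scores)):
        if empirical_far(impostor_scores, candidate) <= target_far:
            return candidate
    return None


if __name__ == "__main__":
    # Made-up impostor scores, for illustration only.
    scores = [5, 12, 18, 22, 25, 31, 8, 15, 27, 40]
    print(empirical_far(scores, 30))        # 0.2
    print(threshold_for_far(scores, 0.2))   # 31
    print(threshold_for_far(scores, 0.01))  # None
```

Note that verifying a FAR of 1:N this way requires far more than N impostor comparisons; with too few, the helper returns None rather than guessing a threshold.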