Whether two faces are considered a match is decided by comparing the score with a threshold. A score below the threshold means the faces probably do not match; a score above it means they probably do. You choose the threshold yourself, based on the false accept rate (FAR) your application can tolerate. Scores are normalized using the formula score = -10*log10(FAR). A score of 30 therefore corresponds roughly to a FAR of 1:1,000 (1 falsely accepted face in 1,000), and a score of 50 to a FAR of roughly 1:100,000. Because the score is logarithmic, you cannot treat it as a linear indicator of match quality. For example, scores of 50 and 500 do not mean "a weak match" and "a super match", but rather "a good match" and "a slightly better match".
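The relationship between score and FAR can be sketched in Python as follows. This is only an illustration of the formula -10*log10(FAR), not part of any library API: the function names are hypothetical, and treating a score exactly equal to the threshold as a non-match is an assumption.

```python
import math


def far_to_score(far: float) -> float:
    """Convert a false accept rate (0 < far <= 1) to a normalized score."""
    if not 0.0 < far <= 1.0:
        raise ValueError("far must be in the interval (0, 1]")
    return -10.0 * math.log10(far)


def score_to_far(score: float) -> float:
    """Invert the normalization: the nominal FAR that a score corresponds to."""
    return 10.0 ** (-score / 10.0)


def is_match(score: float, threshold: float) -> bool:
    """Assumed decision rule: only a score strictly above the threshold matches."""
    return score > threshold


if __name__ == "__main__":
    # Each step of 10 in the score divides the nominal FAR by 10.
    for score in (30, 40, 50):
        print(f"score {score} -> nominal FAR {score_to_far(score):.0e}")
```

To pick a threshold for a target FAR of 1:10,000, you would call far_to_score(1e-4) and compare each score against the result with is_match.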
Keep in mind that the normalization formula is based on general data. Your own dataset will produce different FARs, so the threshold will need to be adjusted.
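One possible way to adjust the threshold is to measure the FAR empirically on your own data. The sketch below assumes you have collected scores from impostor comparisons (pairs of faces known to belong to different people) and that a score strictly above the threshold counts as a match. The function names are hypothetical and the approach is a generic one, not a procedure prescribed by the library.

```python
from typing import Sequence


def empirical_far(impostor_scores: Sequence[float], threshold: float) -> float:
    """Fraction of impostor comparisons that would be falsely accepted."""
    if not impostor_scores:
        raise ValueError("impostor_scores must not be empty")
    accepted = sum(1 for s in impostor_scores if s > threshold)
    return accepted / len(impostor_scores)


def threshold_for_far(impostor_scores: Sequence[float], target_far: float) -> float:
    """Lowest observed score usable as a threshold without exceeding target_far."""
    if not impostor_scores:
        raise ValueError("impostor_scores must not be empty")
    if not 0.0 <= target_far <= 1.0:
        raise ValueError("target_far must be in the interval [0, 1]")
    ranked = sorted(impostor_scores, reverse=True)
    # Number of impostor pairs we can afford to accept at the target FAR.
    allowed = min(int(target_far * len(ranked)), len(ranked) - 1)
    # Only the scores ranked above this one are strictly greater than it,
    # so at most `allowed` impostor pairs pass the threshold.
    return ranked[allowed]
```

The estimate is only as reliable as the number of impostor comparisons behind it: confirming a FAR of 1:100,000 needs far more than 100,000 impostor scores, so small datasets can only support relatively loose targets.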