Family-based behavioral models capture the behavior of a software product line (SPL) in a single model that incorporates the variability among its products. A common way to represent such models is to annotate well-known behavioral modeling notations with features; the featured finite state machine (FFSM), for example, extends the finite state machine notation. Family-based behavioral models are not always prepared before an SPL is developed, nor are they always kept up to date during development and maintenance. Model learning is helpful in such situations. By exploiting the commonality among the products of an SPL, the models of individual products can be reused when learning the behavior of the entire SPL. In this paper, we enhance the process of constructing FFSM models for SPLs. Model learning is performed using an adaptive learning algorithm called PL*. For the model learning step, we introduce a new heuristic for determining product learning orders that yield high learning efficiency. The proposed heuristic takes into account the complexity of the features added by each product and improves on previous learning-order heuristics. To construct the family-based behavioral model of an SPL, the behavioral models of individual products are iteratively merged into the family-based model. A similarity metric determines which states of the two models are merged with each other. By formalizing the existing FFSMDiff algorithm used for this purpose, we prove that the choice of similarity metric does not affect the observable behavior of the FFSM it constructs. We study the efficiency of three similarity metrics, two of which are local in the sense that they determine the similarity of two states only in terms of their adjacent transitions.
In contrast, a global similarity metric takes into account not only the adjacent transitions but also the similarity of the adjacent states. Experiments on two case studies show that local similarity metrics can produce FFSMs as concise as those produced by the global similarity metric. The results also show that local similarity metrics improve efficiency and scalability while maintaining the effectiveness of FFSM construction.
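
To make the distinction between local and global similarity concrete, the following Python sketch may help. It is an illustration under stated assumptions and not the metrics evaluated in this paper: the dictionary encoding of a state machine, the Jaccard-style local score, the depth-bounded recursion standing in for a global score, and the attenuation factor `k` are all hypothetical choices made for the example.

```python
from typing import Dict, Tuple

# Hypothetical encoding (an assumption of this sketch):
# state -> {(input, output): next_state}
Machine = Dict[str, Dict[Tuple[str, str], str]]


def local_similarity(m1: Machine, s1: str, m2: Machine, s2: str) -> float:
    """Jaccard overlap of the (input, output) labels leaving the two states.

    Only adjacent transitions are inspected; target states are ignored.
    """
    a, b = set(m1[s1]), set(m2[s2])
    if not a and not b:
        return 1.0  # two states without outgoing transitions look identical
    return len(a & b) / len(a | b)


def global_similarity(m1: Machine, s1: str, m2: Machine, s2: str,
                      depth: int = 2, k: float = 0.5) -> float:
    """Local score blended with the similarity of successor states.

    Depth-bounded stand-in for a global metric: successors reached on
    shared labels contribute with attenuation factor k (assumed value).
    """
    local = local_similarity(m1, s1, m2, s2)
    shared = set(m1[s1]) & set(m2[s2])
    if depth == 0 or not shared:
        return local
    successors = sum(
        global_similarity(m1, m1[s1][label], m2, m2[s2][label], depth - 1, k)
        for label in shared
    ) / len(shared)
    return (local + k * successors) / (1 + k)


# Two toy product models that differ in one output of their second state.
m1: Machine = {
    "a0": {("coin", "ok"): "a1"},
    "a1": {("push", "open"): "a0", ("coin", "ok"): "a1"},
}
m2: Machine = {
    "b0": {("coin", "ok"): "b1"},
    "b1": {("push", "open"): "b0", ("coin", "thanks"): "b1"},
}

local_score = local_similarity(m1, "a0", m2, "b0")
global_score = global_similarity(m1, "a0", m2, "b0")
```

In this toy example the local score cannot distinguish `a0` from `b0`, because their outgoing labels coincide, whereas the global score is lower because their successors `a1` and `b1` differ. The local score inspects only one pair of states, while the global score must also evaluate successor pairs, which is the kind of cost difference the efficiency comparison concerns.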