To evaluate the performance of the proposed algorithm, we compare it with eight other algorithms: GAME1^{17}, GAME2^{24}, GAME3^{25}, SLPA^{22}, OSLOM^{26}, CPM^{27}, GCE^{28} and LFM^{29}. GAME1 is based on non-cooperative game theory and has a time complexity of \(O(m^2)\). GAME2 and GAME3 are based on cooperative game theory, with time complexities of \(O(n^2)\) and \(O(n \log n)+O(n k_{max})\), respectively, where \(k_{max}\) is the maximum degree of the graph. The results of our algorithm in this section were obtained with \(\epsilon\) set to 0.01. The results of the other algorithms are taken from their original papers or from comparative studies^{30}. For algorithms with tunable parameters, these papers state that the reported results are those obtained with the best setting.

### Evaluation criteria

Various metrics exist for evaluating the results of community detection algorithms, and choosing among them is often challenging because no canonical solution is available^{31}. The relationship between the topological properties of community structure and the alternative evaluation measures, as well as the reliability of the different criteria, has been discussed comprehensively in many studies^{32}. First of all, the appropriate evaluation criterion depends on whether a ground truth is known for the network under examination. When it is, several measures are widely used to compare the detected communities against the ground-truth communities: the Average F1 score (AvgF1)^{33}; the Adjusted Rand Index (ARI), which ensures that the value for a random clustering is close to zero; the Omega Index^{34}, the overlapping version of ARI^{30}, which takes into account the number of clusters shared by each pair of nodes; and Normalized Mutual Information (NMI)^{35}, which is derived from information theory. In the current work, we used AvgF1 and an extended version of NMI that is suitable for comparing two overlapping community structures^{29}. The closer the value of NMI or AvgF1 is to 1, the more similar the detected community structure is to the ground truth; a value of 0 indicates the least similarity.
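
To illustrate how AvgF1 compares two covers, the following minimal Python sketch matches every ground-truth community to its best-scoring detected community and vice versa, then averages the two directions. It assumes the symmetric form of the score that is commonly used for overlapping communities; the function names `f1` and `avg_f1` and the representation of communities as Python sets are illustrative choices, not part of any of the evaluated algorithms.

```python
def f1(truth: set, detected: set) -> float:
    """F1 score (harmonic mean of precision and recall) between two node sets."""
    overlap = len(truth & detected)
    if overlap == 0:
        return 0.0
    precision = overlap / len(detected)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


def avg_f1(truth_cover, detected_cover) -> float:
    """Average of the best-match F1 in both directions between two covers."""
    if not truth_cover or not detected_cover:
        return 0.0
    # Best detected match for every ground-truth community.
    truth_side = sum(
        max(f1(t, d) for d in detected_cover) for t in truth_cover
    ) / len(truth_cover)
    # Best ground-truth match for every detected community.
    detected_side = sum(
        max(f1(t, d) for t in truth_cover) for d in detected_cover
    ) / len(detected_cover)
    return 0.5 * (truth_side + detected_side)


# Hypothetical example: two overlapping ground-truth communities sharing node 3.
truth = [{1, 2, 3}, {3, 4, 5}]
detected = [{1, 2, 3}, {3, 4, 5}]
score = avg_f1(truth, detected)
```

Because each community is matched independently, a node may appear in several sets, which is what makes the score usable for overlapping structures.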

When testing overlapping community detection algorithms, especially when the ground truth of the communities is unknown, the overlapping modularity \(Q_{ov}\) is a well-known and frequently used metric^{36}. It is an extension of classical modularity, and a higher value indicates better-detected communities. For directed networks, this metric is defined as follows:

$$\begin{aligned} Q_{ov}=\frac{1}{m} \sum _{c \in C} \; \sum _{i,j \in V} \left[ \beta _{l(i,j),c}A_{i,j}-\frac{\beta _{l(i,j)}^{out}k_{i}^{out}\beta _{l(i,j)}^{in}k_{j}^{in}}{m} \right] \end{aligned}$$

(8)

With the following minor changes, it can be used for undirected networks:

$$\begin{aligned} Q_{ov}=\frac{1}{2m} \sum _{c \in C} \; \sum _{i,j \in V} \left[ \beta _{l(i,j),c}A_{i,j}-\frac{\beta _{l(i,j)}^{\prime }k_{i}\beta _{l(i,j)}^{\prime }k_{j}}{2m} \right] \end{aligned}$$

(9)

The components of this equation are given by:

$$\begin{aligned} \beta _{l(i,j)}^{\prime }=\beta _{l(i,j)}^{out}=\beta _{l(i,j)}^{in}= \frac{\sum _{i \in V}F(\alpha _{i,c},\alpha _{j,c})}{|V|} \end{aligned}$$

(10)

$$\begin{aligned} \beta _{l(i,j),c}=F(\alpha _{i,c},\alpha _{j,c}) \end{aligned}$$

(11)

$$\begin{aligned} k_{i}^{out}=k_{i}^{in}=k_{i} \end{aligned}$$

(12)

$$\begin{aligned} F(\alpha _{i,c},\alpha _{j,c})=\frac{1}{(1+e^{-f(\alpha _{i,c})})(1+e^{-f(\alpha _{j,c})})} \end{aligned}$$

(13)

$$\begin{aligned} f(x)=2px-p, \quad p \in {\mathbb {R}} \end{aligned}$$

(14)

where \(\alpha _{i,c}\) is the belonging coefficient of node *i* to community *c*, and *p* in *f*(*x*) is an arbitrary real parameter, which is set to 30 in the current study.
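
The undirected form of \(Q_{ov}\) in Eq. (9) can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation used for the reported results: the adjacency matrix is assumed symmetric with 0/1 entries, the belonging coefficients are given as an \(|V| \times |C|\) array, the weighting function is assumed to be the logistic form \(1/(1+e^{-f(x)})\) so that *F* approaches 1 when both coefficients approach 1, and the expected-weight factor of each endpoint is read as the mean of *F* over all possible partners of that endpoint. The function name `overlapping_modularity` is hypothetical.

```python
import numpy as np


def overlapping_modularity(adj, alpha, p=30.0):
    """Sketch of Q_ov for an undirected graph.

    adj   : symmetric 0/1 adjacency matrix, shape (n, n)
    alpha : belonging coefficients, shape (n, number_of_communities)
    p     : steepness of f(x) = 2px - p (30 in the text)
    """
    adj = np.asarray(adj, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    two_m = adj.sum()        # every undirected edge is counted twice
    k = adj.sum(axis=1)      # node degrees

    # Assumed logistic weighting: g(x) = 1 / (1 + exp(-f(x))).
    g = 1.0 / (1.0 + np.exp(-(2.0 * p * alpha - p)))

    q = 0.0
    for c in range(alpha.shape[1]):
        gc = g[:, c]
        # F(alpha_ic, alpha_jc) factorises into g(alpha_ic) * g(alpha_jc).
        beta = np.outer(gc, gc)
        # Assumed reading of Eq. (10): average F over all partners of a node.
        beta_end = gc * gc.mean()
        expected = np.outer(beta_end * k, beta_end * k) / two_m
        q += (beta * adj - expected).sum()
    return q / two_m


# Hypothetical toy graph: two disjoint triangles, one community per triangle.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1
alpha = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
q_ov = overlapping_modularity(adj, alpha)
```

With *p* = 30 the weighting is close to a step function at a belonging coefficient of 0.5, so nodes with coefficients well below 0.5 contribute almost nothing to a community's term.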

### Synthetic networks

Among the best-known benchmarks are the synthetic LFR networks, which can be generated by the method proposed by Lancichinetti and Fortunato^{37}. Whereas real-world networks have clearly nonzero degree correlation among nodes and relatively high transitivity, networks generated by the LFR method have near-zero degree correlation and low transitivity^{38,39,40}. Despite this drawback and some other limitations of the LFR method, these networks still exhibit fairly realistic properties. Given the large amount of experimental data available from tests of other algorithms on them, LFR networks are among the most suitable choices for testing the performance of community detection algorithms. The method has 10 adjustable parameters. By setting these parameters, we generated 6 groups of LFR networks for the performance tests, as shown in Fig. 5. The mixing parameter \(\mu\) is the fraction of a node’s links that connect it to nodes in other communities, so that \(k_i^{in}=(1-\mu )k_i\), where \(k_i^{in}\) here denotes the number of links of node *i* inside its own communities. \(\tau _1\) and \(\tau _2\) are the exponents of the power-law distributions of node degrees and community sizes, respectively. The overlapping features of an LFR network are controlled by *Om* (the number of communities to which each overlapping node belongs) and *On* (the fraction of nodes that belong to more than one community). For the performance test of our algorithm on LFR networks, we report results averaged over at least 10 instantiations of these networks for each parameter set.
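
The role of the mixing parameter can be made concrete by measuring it on a given network. The sketch below computes, for every node, the fraction of its links that lead to nodes sharing none of its communities, which is the empirical counterpart of \(\mu\). The adjacency-list and membership dictionaries, and the function name `node_mixing`, are illustrative assumptions rather than part of the LFR generator.

```python
def node_mixing(adjacency, memberships):
    """Per-node fraction of links that leave all of the node's communities.

    adjacency   : dict mapping node -> list of neighbouring nodes
    memberships : dict mapping node -> set of community labels
    """
    mixing = {}
    for node, neighbours in adjacency.items():
        if not neighbours:
            mixing[node] = 0.0  # isolated node: no links to classify
            continue
        external = sum(
            1 for nb in neighbours
            if not (memberships[node] & memberships[nb])
        )
        mixing[node] = external / len(neighbours)
    return mixing


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
adjacency = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
memberships = {0: {"a"}, 1: {"a"}, 2: {"a"}, 3: {"b"}, 4: {"b"}, 5: {"b"}}
mu = node_mixing(adjacency, memberships)
```

In this toy graph only nodes 2 and 3 have an external link, so their internal degree is \((1-\mu_i)k_i = 2\) out of a total degree of 3.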

The NMI values obtained by our proposed algorithm and the other algorithms are shown in Fig. 6. As expected, the NMI values gradually decrease as *Om* increases. In most cases, however, our algorithm outperforms the others, especially on synthetic networks with smaller community sizes and more overlapping nodes.

For networks with overlapping communities, evaluating a community detection algorithm must include checking the number of overlapping nodes it identifies, which is an important factor in the algorithm’s accuracy. Overlapping nodes play a crucial role in real-world social networks because they usually act as bridges or messengers between communities^{30}. The *On* values detected by the proposed algorithm and the other algorithms for two groups of LFR networks, with ground-truth *On* of 0.1 and 0.5, are shown in Fig. 7. The number of overlapping nodes identified by our algorithm increases gradually as *Om* increases. This trend contrasts with that of the other algorithms, except SLPA on the LFR3 network.
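
The detected *On* can be read directly from a cover by counting how many communities each node belongs to. The following sketch is one straightforward way to do this, assuming communities are given as sets of node identifiers and the total number of nodes is known; the name `overlap_statistics` is hypothetical.

```python
from collections import Counter


def overlap_statistics(communities, n_nodes):
    """Return (On, mean Om) for a cover.

    On      : fraction of all nodes that belong to more than one community
    mean Om : average number of memberships among the overlapping nodes
    """
    # Number of communities each node appears in.
    counts = Counter(node for com in communities for node in set(com))
    overlapping = [c for c in counts.values() if c > 1]
    on = len(overlapping) / n_nodes if n_nodes else 0.0
    mean_om = sum(overlapping) / len(overlapping) if overlapping else 0.0
    return on, mean_om


# Hypothetical cover of 7 nodes: node 3 is in three communities, node 5 in two.
cover = [{0, 1, 2, 3}, {3, 4, 5}, {3, 5, 6}]
on, mean_om = overlap_statistics(cover, n_nodes=7)
```

Comparing the returned *On* with the ground-truth value used to generate the network gives the kind of check described above.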

To gain a more comprehensive insight into the performance of the algorithms, it is useful to investigate the distribution of detected community sizes (*CS*). For this purpose, we used the algorithms’ results on LFR3, averaged over all values of *Om* and over 10 instantiations of these networks. In the histograms of community sizes shown in Fig. 8, small fluctuations were smoothed out by plotting fitted curves instead of raw data. For comparison, the ground-truth power-law distribution is shown in each histogram. Except for our algorithm and SLPA, the algorithms are markedly weak at detecting larger communities. In addition, some algorithms tend to break communities into smaller parts, which concentrates the distribution in a range of small community sizes that does not exist in the true distribution. Although such misclustering also occurs to some extent with our algorithm, it is less pronounced than with GAME1, LFM, and especially CPM and OSLOM. In particular, the results demonstrate the relatively better performance of our algorithm in detecting larger communities.
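
A community-size distribution of this kind can be obtained by binning the sizes of the detected communities and normalising the counts. The sketch below shows the binning step only, using `numpy.histogram`; the bin edges, the function name `size_distribution`, and the example cover are illustrative assumptions, and the curve fitting used for Fig. 8 is not reproduced.

```python
import numpy as np


def size_distribution(communities, bin_edges):
    """Fraction of communities whose size falls into each bin."""
    sizes = np.array([len(com) for com in communities])
    counts, edges = np.histogram(sizes, bins=bin_edges)
    total = counts.sum()
    # Guard against an empty cover or sizes outside all bins.
    fractions = counts / total if total > 0 else counts.astype(float)
    return fractions, edges


# Hypothetical cover with three small communities and one large one.
cover = [{0, 1}, {2, 3, 4}, {5, 6, 7}, set(range(8, 18))]
fractions, edges = size_distribution(cover, bin_edges=[1, 5, 11])
```

Applying the same bins to the ground-truth communities and to each algorithm’s output makes an excess of small communities or a lack of large ones directly visible.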

### Real networks

To further evaluate the proposed algorithm, we tested its performance on real-world networks. Eight real networks were chosen for this test; they are described in Fig. 9 (data for the last three, larger networks are available at http://snap.stanford.edu). As evaluation measures, the overlapping modularity was used for the first six networks and the AvgF1 score for the last two.

A stacked bar chart of \(Q_{ov}\) for the community structures obtained on the first six networks by our algorithm and the other algorithms is shown in Fig. 10. This illustration allows us to compare the overall performance of the algorithms across all six networks. For Dolphins, Football, Polbooks, and PGP, our algorithm obtains a \(Q_{ov}\) value slightly higher than those of the other algorithms. Moreover, the sum of the \(Q_{ov}\) values obtained by our algorithm is higher than that of the others. As an example, the community structure of the karate network obtained by our algorithm is shown in Fig. 10. This network is of traditional importance and was studied by Wayne W. Zachary for three years, from 1970 to 1972^{41}. The ground truth observed by Zachary contains two communities, which are represented in Fig. 10. As can be seen, the detected community structure matches the ground truth exactly, except for node 10. However, placing node 10 in the overlap of the two communities is sensible, considering that it is equally connected to both.

For the last two, larger networks, which have known community structure, a bar chart of the AvgF1 scores for the community structures obtained by our algorithm and the other algorithms is shown in Fig. 11. For these networks, in addition to the previously used algorithms, the results of the BigClam^{33} and GLEAM^{5} algorithms are included for comparison. The data on the other algorithms’ performance on these two networks are taken from the original GLEAM paper^{5}. Based on the results in Fig. 11, the proposed algorithm, along with the GLEAMo algorithm, performs best in detecting the community structure of these two networks.