Integrated sensing and communication (ISAC) has become a key technology in sixth-generation (6G) wireless networks, catering to the growing need for ubiquitous sensing and communication tasks. A simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) can exploit both reflected and refracted signals. Because of the orientation limitation of a single STAR-RIS, a multi-dual STAR-RIS (MD-STAR) architecture is conceived to facilitate full-plane services in ISAC systems. In this paper, we optimize the active beamforming of dual-function radar-communication (DFRC)-enabled base stations (BSs) and the passive beamforming of MD-STAR in ISAC systems, aiming to maximize the achievable sum rate subject to a maximum position error bound (PEB) constraint and the hardware limitations of MD-STAR. To solve this complex problem, we propose a two-layered multi-agent federated Q-learning (TMFQ) scheme. The inner-layer Q-learning obtains the beamforming solution for the BSs and MD-STAR, whereas the outer-layer Q-learning optimizes the hyperparameters of the inner layer, including its learning rate and discount rate. Additionally, we employ federated learning to facilitate information exchange between the agents in the inner-layer Q-learning. We evaluate the proposed TMFQ under different numbers of MD-STAR elements, transmit antennas, and sensing targets. Benefiting from the hyperparameter optimization of the inner-layer Q-learning and the information exchange of federated learning, the proposed TMFQ achieves the highest rate compared with the benchmarks, including Q-learning without hyperparameter optimization and without federated learning, a heuristic algorithm, and conventional beamforming.
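To make the two-layer structure described above concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes tabular Q-learning agents, equal-weight averaging of Q-tables as the federated step, and a single-state (bandit-style) outer learner that selects a (learning rate, discount rate) pair. The environment, `toy_reward`, the constants, and all function names are hypothetical placeholders; the actual state, action, and reward design for beamforming under the PEB constraint is not specified here.

```python
import random

N_STATES, N_ACTIONS = 4, 4  # hypothetical toy sizes, not taken from the paper


def q_update(q, s, a, r, s_next, alpha, gamma):
    """One tabular Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])


def federated_average(q_tables):
    """Element-wise mean of the agents' Q-tables (equal-weight aggregation, assumed)."""
    n = len(q_tables)
    n_states, n_actions = len(q_tables[0]), len(q_tables[0][0])
    return [[sum(q[s][a] for q in q_tables) / n for a in range(n_actions)]
            for s in range(n_states)]


def toy_reward(state, action, rng):
    """Hypothetical stand-in for a rate-based reward; one action per state is best."""
    best = (state + 1) % N_ACTIONS
    return (1.0 if action == best else 0.0) + rng.gauss(0.0, 0.1)


def run_inner(alpha, gamma, n_agents=3, episodes=200, sync_every=20, eps=0.1, seed=0):
    """Inner layer: epsilon-greedy Q-learning agents with periodic Q-table averaging."""
    rng = random.Random(seed)
    tables = [[[0.0] * N_ACTIONS for _ in range(N_STATES)] for _ in range(n_agents)]
    total = 0.0
    for t in range(1, episodes + 1):
        for q in tables:
            s = rng.randrange(N_STATES)
            if rng.random() < eps:
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda i: q[s][i])
            r = toy_reward(s, a, rng)
            s_next = rng.randrange(N_STATES)  # toy transition: uniformly random
            q_update(q, s, a, r, s_next, alpha, gamma)
            total += r
        if t % sync_every == 0:
            # Federated step: every agent restarts from the averaged table.
            avg = federated_average(tables)
            tables = [[row[:] for row in avg] for _ in range(n_agents)]
    return total / (episodes * n_agents)  # mean reward per agent step


def run_outer(candidates, rounds=30, beta=0.5, eps=0.2, seed=1):
    """Outer layer (assumed single-state): learn a value for each hyperparameter pair."""
    rng = random.Random(seed)
    q_outer = [0.0] * len(candidates)
    for k in range(rounds):
        if rng.random() < eps:
            i = rng.randrange(len(candidates))
        else:
            i = max(range(len(candidates)), key=lambda j: q_outer[j])
        alpha, gamma = candidates[i]
        reward = run_inner(alpha, gamma, seed=k)  # inner performance is the outer reward
        q_outer[i] += beta * (reward - q_outer[i])
    best = max(range(len(candidates)), key=lambda j: q_outer[j])
    return candidates[best], q_outer


if __name__ == "__main__":
    pairs = [(0.05, 0.5), (0.2, 0.9), (0.5, 0.1), (0.9, 0.9)]
    best_pair, values = run_outer(pairs)
    print("selected (learning rate, discount rate):", best_pair)
```

In this sketch the outer learner treats the mean reward of one inner run as its own reward, which is one simple way to couple the two layers; other couplings, such as a multi-state outer learner, are equally compatible with the description above.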