Recently, many software runtime systems have been proposed that allow developers to efficiently map applications to contemporary consumer electronic devices and high-performance academic processing platforms. Most of these runtime systems employ advanced scheduling techniques for automatic task assignment to all available processing elements. However, they focus on a particular environment and architecture, and it is not easy to port them to reconfigurable embedded MPSoCs. As a consequence, in the embedded community, researchers implement hardwired application-specific task schedulers, which can not be used by other embedded MPSoCs. To address this problem, in this paper we propose a lightweight runtime software framework for reconfigurable shared-memory MPSoCs, that integrate a master embedded processor connected to slave cores. Similarly to many of the aforementioned advanced runtime systems, we adopt a task-based programming model that uses simple, pragma-based annotations of the application software, in order to dynamically resolve task dependencies. Our runtime system supports heterogeneity in the hardware resources, and is also low-overhead to account for possible limitations in their processing capabilities and available on-chip memory. To evaluate our proposal, we have prototyped an MPSoC with seven slaves to a Xilinx ML605 FPGA board. We run three micro-benchmarks that achieve a performance speedup of 3.8x, 7x and 5.8x, and energy consumption of 27%, 14% and 18% respectively, compared to a single-core baseline system with no runtime support.