Sharded blockchains have recently attracted increasing attention. Their inherited immutability and decentralization, together with improved scalability, effectively address the trust issue of data sharing in the Internet of Things (IoT). Nevertheless, the traditional random allocation between validator groups and transaction pools ignores the differences among shards, which reduces overall system performance because of the imbalance between computing capacity and transaction load. To solve this problem, a load-balance optimization framework for sharded-blockchain-enabled IoT is proposed, in which the allocation between validator groups and transaction pools is determined by deep reinforcement learning (DRL). Specifically, based on a theoretical analysis of the intra-shard consensus and the final system consensus, the optimization of system performance is formulated as a Markov decision process (MDP), and the allocation of the transaction pools, the block size, and the block interval are jointly optimized by the DRL agent. Simulation results show that the proposed scheme improves the scalability of the sharded blockchain system for IoT.
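
The imbalance between computing capacity and transaction load can be illustrated with a toy model. The sketch below is not the proposed DRL scheme: it assumes, purely for illustration, that each validator group has a fixed processing rate, that each transaction pool holds a fixed backlog, and that the system finishes a round only when the slowest shard finishes. It replaces the DRL agent with an exhaustive search over allocations and ignores block size and block interval. The names `makespan` and `best_assignment`, and all numbers, are hypothetical.

```python
from itertools import permutations


def makespan(capacities, loads, assignment):
    """Time until the slowest shard finishes.

    assignment[g] is the index of the transaction pool given to validator group g.
    Assumption: shard time = pool backlog / group processing rate.
    """
    return max(loads[p] / capacities[g] for g, p in enumerate(assignment))


def best_assignment(capacities, loads):
    """Exhaustive search over one-to-one allocations (viable only for a few shards)."""
    return min(
        permutations(range(len(loads))),
        key=lambda a: makespan(capacities, loads, a),
    )


# Hypothetical values: processing rate per validator group (tx/s)
# and pending transactions per pool.
capacities = [100.0, 200.0, 400.0]
loads = [800.0, 200.0, 400.0]

mismatched = (0, 1, 2)  # one allocation that a random draw could produce
balanced = best_assignment(capacities, loads)

print("mismatched:", mismatched, makespan(capacities, loads, mismatched))
print("balanced:  ", balanced, makespan(capacities, loads, balanced))
```

In this toy setting the mismatched allocation leaves the weakest group with the largest pool, so one shard dominates the completion time, while the capacity-matched allocation equalizes the shards. Exhaustive search grows factorially with the number of shards, which is one reason a learned policy is attractive for larger systems.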