{"library":"mabwiser","title":"MABWiser","description":"MABWiser is a Python library for parallelizable, contextual multi-armed bandits. It supports a wide range of bandit learning policies (e.g., epsilon-greedy, Thompson Sampling, LinUCB) and neighborhood policies for contextual bandits. The latest release is version 2.7.4, and the library is under active development.","language":"python","status":"active","last_verified":"Fri May 01","install":{"commands":["pip install mabwiser"],"cli":null},"imports":["from mabwiser.mab import MAB","from mabwiser.mab import LearningPolicy","from mabwiser.mab import NeighborhoodPolicy"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"import numpy as np\nfrom mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy\n\n# Non-contextual bandit\narms = ['arm1', 'arm2']\nmab = MAB(arms, LearningPolicy.EpsilonGreedy(epsilon=0.1))\n# fit() must be called before predict(); seed with one observation per arm\nmab.fit(decisions=['arm1', 'arm2'], rewards=[1, 0])\n# Simulate online learning with dummy rewards\nfor _ in range(100):\n    arm = mab.predict()\n    reward = np.random.binomial(1, 0.7 if arm == 'arm1' else 0.3)\n    # partial_fit expects array-like decisions and rewards, not scalars\n    mab.partial_fit([arm], [reward])\n\n# Contextual bandit with a clustering neighborhood policy\ncontexts = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])\nmab_ctx = MAB(arms, LearningPolicy.EpsilonGreedy(epsilon=0.1), NeighborhoodPolicy.Clusters())\n# Argument order is decisions, rewards, contexts\nmab_ctx.fit(decisions=['arm1', 'arm2', 'arm1'], rewards=[1, 0, 1], contexts=contexts)\nprint(mab_ctx.predict(contexts[-1:]))","lang":"python","description":"Minimal example: a non-contextual epsilon-greedy bandit trained with fit and partial_fit, and a contextual bandit using the Clusters neighborhood policy.","tag":null,"tag_description":null,"last_tested":null,"results":[]},"compatibility":null}