NOTE: This page is effectively read-only, except by the bot organizers. Please post any ideas and suggestions on the discussion page.
Development plan and future ideas
NOTE: The items below are thoughts for the future and are not included in the initial proposed specs.
See also: User:ProteinBoxBot/Project_proposals
Next up for implementation
- per discussion on Commons, add PDB infobox to all PDB images (Example [1])
- Run bot update
- pilot project for ((SWL))
- find some well-known facts
- encode them in Gene Wiki article using ((SWL))
- figure out synchronization with wikidraft.org/SMW, converting SWLs to real semantic links
- OUTPUT: demonstrate real inline queries on wikidraft.org
- OUTPUT: export from SMW to RDF
- pilot collaboration with MODs (specifically ZFIN)
- scan through all Gene Wiki pages for inline citations
- retrieve MeSH terms and identify matching species (human, mouse, zebrafish, fly, rat, yeast)
- generate four-column output file:
- WP article name
- cited PubMed ID
- matching organisms by MeSH
- sentence(s) referencing the publication
- is there a MeSH-to-taxonomy mapping, or do we need free-text matching?
- for pubs that reference multiple species, one line per species
- for articles that reference a pub multiple times, concatenate sentences
- redesign infobox to better handle linking to MODs (MGD, RGD, ZFIN, FlyBase, WormBase, etc.)
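A rough sketch of the four-column output for the MOD pilot above, assuming citations have already been extracted from the articles as (article, PubMed ID, sentence) tuples and MeSH terms have already been fetched per PubMed ID. The function names, the input shapes, and the `MESH_TO_ORGANISM` lookup are all assumptions made for illustration; whether a proper MeSH-to-taxonomy mapping exists is still an open question, so the small hand-written dictionary here only stands in for one.

```python
import csv
import io
from collections import defaultdict

# Hypothetical hand-written lookup from MeSH descriptor name to organism label.
# A real MeSH-to-taxonomy mapping (if one exists) would replace this.
MESH_TO_ORGANISM = {
    "Humans": "human",
    "Mice": "mouse",
    "Zebrafish": "zebrafish",
    "Drosophila melanogaster": "fly",
    "Rats": "rat",
    "Saccharomyces cerevisiae": "yeast",
}


def build_rows(citations, mesh_terms_by_pmid):
    """Build (article, pmid, organism, sentences) rows.

    citations: iterable of (article, pmid, sentence) tuples.
    mesh_terms_by_pmid: dict mapping pmid -> list of MeSH descriptor names.
    """
    # An article that cites the same publication several times gets
    # its citing sentences concatenated into one field.
    sentences = defaultdict(list)
    for article, pmid, sentence in citations:
        sentences[(article, pmid)].append(sentence)

    rows = []
    for (article, pmid), sents in sentences.items():
        terms = mesh_terms_by_pmid.get(pmid, [])
        organisms = sorted({MESH_TO_ORGANISM[t] for t in terms if t in MESH_TO_ORGANISM})
        # One line per species; publications with no matching organism
        # produce no line at all in this sketch.
        for organism in organisms:
            rows.append((article, pmid, organism, " ".join(sents)))
    return rows


def write_tsv(rows):
    """Serialize rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["article", "pmid", "organism", "sentences"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up placeholder data, not real articles or PubMed IDs.
    demo_citations = [
        ("ExampleGene", "111", "First citing sentence."),
        ("ExampleGene", "111", "Second citing sentence."),
    ]
    demo_mesh = {"111": ["Humans", "Mice"]}
    print(write_tsv(build_rows(demo_citations, demo_mesh)), end="")
```

Tab-separated output is just one choice here; any delimiter that does not clash with the sentence text would do.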
Add additional links
- GeneCards
- nextbio.com?
- wikiprofessional
- wikigenes
- WikiPathways.org
- KEGG (also add wikilinks to other gene pages in the same KEGG pathways)
- HPRD
- link to Bioinformatic Harvester? -- would need community consensus...
Add/improve stub data (gene-specific)
- change format of the references section to make it small-screen friendly ([2])
- Add GeneRIFs and references from UniProt
- import and display EC number
- import and display protein domain information (through UniProt/PFAM/COGs). See previous discussion.
- UniProt fields: PFAM, "Protein name", "Synonyms", FUNCTION, DOMAIN, SUBCELLULAR LOCATION, CATALYTIC ACTIVITY, COFACTOR, SUBUNIT, and WEB RESOURCE
- Need to fix the db links for genome locations: the default for mouse has gone to mm9 (see User_talk:ProteinBoxBot#Mouse_location_links_lack_db_name_parameter). Either change the default in the template, or do a second-pass run on all infoboxes to add the parameter.
- Load PPI from Entrez Gene (see User_talk:ProteinBoxBot/Archives/Archive1#Interaction_partners)
- Add a note in infobox showing last-updated date
- for GO section, add a small note of the evidence code and a link to the PubMed reference, if available
- add image maps to thumbnail expression images so that tissues can be identified
- add a banner from gene talk pages to portal page ([3])
Add/improve stub data (structure)
Technical bot stuff
Parallel efforts
- upload all PDB images to Flickr? allows browsing of entire SCOP sub-trees. maybe geotag by location?
- create a WP category for every GO category? (piggyback on Enzyme class effort?)
- expand to create pages for each disease using ((Infobox_Disease))
- second bot to wikilink common biology concepts, specifically on pages with PBB_Controls
- change ((Gene)) templates to internal wikilinks
- systematic creation of articles around protein domains (e.g., SMART database)
- Mass autogeneration of high-quality PDB images
Other
- look into HSPA1A and HSPA1B [7]
- automated way to create this table
- create a Mac dashboard widget for the Gene Wiki?
- charting library to combine bar chart with background histogram... (not really Gene Wiki related...)