Wikipedia:Bot requests


This is a page for requesting tasks to be done by bots per the bot policy. This is an appropriate place to put ideas for uncontroversial bot tasks, to get early feedback on ideas for bot tasks (controversial or not), and to seek bot operators for bot tasks. Consensus-building discussions requiring large community input (such as request for comments) should normally be held at WP:VPPROP or other relevant pages (such as a WikiProject's talk page).

You can check the "Commonly Requested Bots" box above to see if a suitable bot already exists for the task you have in mind. If you have a question about a particular bot, contact the bot operator directly via their talk page or the bot's talk page. If a bot is acting improperly, follow the guidance outlined in WP:BOTISSUE. For broader issues and general discussion about bots, see the bot noticeboard.

Before making a request, please see the list of frequently denied bots: tasks that are either too complicated to program or lack consensus from the Wikipedia community. If you are requesting that a template (such as a WikiProject banner) be added to all pages in a particular category, please be careful to check the category tree for any unwanted subcategories. It is best to give a complete list of categories that should be worked through individually, rather than one category to be analyzed recursively (see example difference).


Note to bot operators: The {{BOTREQ}} template can be used to give common responses, and make it easier to keep track of the task's current status. If you complete a request, note that you did with {{BOTREQ|done}}, and archive the request after a few days (WP:1CA is useful here).

Please add your bot requests to the bottom of this page.

Lists of new articles by subject

My kingdom for a bot that compiles new articles in a new subject area (e.g., added to a WikiProject's scope). @PresN currently runs a script that does this manually (see one of the "New Articles" threads at WT:VG), but it would be great to be able to do this for other projects, so that new editors get visibility/help and the project can see the fruits of its efforts. (Also discussed at PresN's talk page.) Special:Contributions/InceptionBot currently finds articles that might be within scope, but this proposal is instead a log of recent additions to a topic area (similar to how the 1.0 project compiles). It could be useful if delivered directly to a WikiProject/noticeboard page or, alternatively, updated on a single page and transcluded à la WP:Article alerts. czar 20:07, 15 December 2019 (UTC)

@Czar: This seems like a fairly simple task, but I want to make sure I have all the details right: For every wikiproject that opts-in, each week, generate a list of articles that had that wikiproject's tag added to their talk page within that week. Information about the article should be included, including importance and quality rating and author. Should non-articles (cats, templates, files, etc) be considered as well, or just articles? What about drafts? Are newly-created redirects important? Do you want articles that were removed from the WikiProject, deleted or redirected too (this would make it more complex)? --AntiCompositeNumber (talk) 20:04, 3 February 2020 (UTC)
@AntiCompositeNumber, yes, that's right! Can detect on the addition of the template or the addition to the category associated with the WikiProject (i.e., ArticleAlerts uses a combination of the banner and the talk category). I'd recommend including the quality rating but excluding the importance, à la {{article status}}. I'd recommend cutting scope to only include articles to keep the v1 reasonable. (Let someone request the extras if they have a valid case, but ArticleAlerts currently lists relevant deletions and AfC drafts.) In WT:VG#New Articles (January 27 to February 2), as an example, I personally don't find the category reports useful. The goal of this bot, to my eyes, is to make WikiProject talk pages closer to topical noticeboards, so editors interested in a topic receive a digest of new article creations to pitch in either to contribute to the article or simply to welcome new/isolated editors. So while I recommend against listing files, cats, templates, drafts, importance, deletions, or redirects, if there is any stretch goal, I'd particularly recommend incorporating InceptionBot's possibly related articles, in case there are new articles from the last week that may be eligible for the project but just haven't been tagged. But, yes, core function is just new, on-topic articles. czar 03:07, 8 February 2020 (UTC)
Czar, how does this (for WP:VG) look? I'll add some explanatory text to it before calling it ready, but wanted to get your thoughts on what the output looks like now. I haven't written the bot part of the bot yet, just the category analysis. The report's only based on the category and doesn't particularly care about the template since that data is much more accessible. Also, do you know of wikiprojects that would be interested in the reports? --AntiCompositeNumber (talk) 22:34, 8 February 2020 (UTC)
@AntiCompositeNumber, love it. Looks great! I'd start with WT:VG just for testing and can advertise/expand it to other projects. (FYI @PresN, curious if this test is missing any articles you'd normally catch) czar 00:31, 9 February 2020 (UTC)
@AntiCompositeNumber and Czar: Ah, the machines are coming for my (machine's) job... Yeah, there are some differences between this bot's output and my script's for the week:
  • Source 2 on Feb 5 is missing - my script lists this because it was previously a redirect (for 2+ years) and was converted to a 'real' article on that day
  • Jablinski Games on Feb 7 is missing - was created as a redirect on Jan 30 with no talk page tag (so, not in the project), then converted to a 'real' article on Feb 7 and a talk page tag added.
  • You have Animal Crossing Plaza on the 2nd when it was created/tagged on the 1st, but I think that might just be a time-zone issue
  • You list Candy Crush Saga as new on Feb 1, when it's years old; this appears to be because of a crazy revert war with a vandal on the talk page on Jan 31.
So, from this limited sample set, it appears the main miss is considering redirect->!redirect as a 'creation', and not discounting the 'creation' of an existing page. That said, I fully expect there to be weirdness around page moves and double-page moves as well, but those are smaller corner cases. The other major difference is that my script would have listed 17 new categories as well (in addition to listing new article deletions and redirections/moves to draft space (aka soft deletions) this week, and (none this week) new templates/template deletions). --PresN 05:36, 9 February 2020 (UTC)
@PresN and Czar: Jablinski Games and Animal Crossing Plaza are both just time zone issues. The bot considers the last 7 full days in UTC, so Animal Crossing Plaza was tagged at 01:05 and Jablinski shows up today.
The other two are because of the data source. My tool is only querying the categorylinks database table for recent additions of the category, so it doesn't pick up redirect -> article conversions. WP 1.0 Bot gets around this by logging article metadata into its own database, but that data isn't super accessible outside of parsing the on-wiki logs (afaict). The categorylinks table only cares about the page id, not the page title, so moves don't affect it. So while there is data for new categorization of drafts, I won't see articles that were previously tagged and were moved to mainspace. There is, of course, data for the tagging of drafts, files, categories, etc.: I'm just ignoring it. Listing articles currently tagged for AfD or PROD wouldn't be too difficult, it's just a category intersection. Code if you're curious --AntiCompositeNumber (talk) 16:13, 9 February 2020 (UTC)
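For anyone following along, the "last 7 full days in UTC" window described above can be sketched in a few lines (the function name and shape are illustrative, not the tool's actual code):

```python
from datetime import datetime, timedelta, timezone

def report_window(now=None):
    """Return (start, end) spanning the last 7 full UTC days.

    Anything tagged during the current, partial UTC day falls outside
    the window and shows up in the next report instead, which is why
    pages tagged shortly after midnight UTC can look like misses.
    """
    now = now or datetime.now(timezone.utc)
    end = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return end - timedelta(days=7), end
```

A query against categorylinks would then filter on `cl_timestamp` between those two bounds.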
@AntiCompositeNumber: OK, so assuming I'm reading this right, there's not really a good way to get un-redirects; they'd show up when the redirect is first created (which isn't ideal as most redirects never get undone, and most un-redirects are years later) but that's it. Same for draft->mainspace, but no issues with page moves. So your version would cover the majority of cases, but would miss those edge cases. That's probably fine for most wikiprojects, though- my non-data-based feeling is that it's the media projects that have the most "article created, redirected, and later re-created" occurrences, whereas projects that get less attention from eager fans don't get as many articles created prematurely.
Your code is definitely more readable than my spaghetti nonsense, though- for an example of what happens if you try to base this off of the WP1.0 bot output and then compound it by actually just parsing the html of Wikipedia:Version 1.0 Editorial Team/Video game articles by quality log directly without any sort of api access and then make it worse by parsing top to bottom aka reverse temporal order, here's the python function that does the logic of building the list of article objects that appear to have been created in the date range given:
  def parse_lists(lists, headers, assessments, new_cats, dates, dates_needed):
    NULL_ASSESSMENT = '----'
    max_lists = dates_needed * 4
    extra_headers = get_extra_headers(headers) # Note "Renamed" headers

    # Initial assessment
    for index, list in enumerate(lists):
      if index <= max_lists:
        for item in list.find_all('li'):
          contents = _.join(item.contents, ' ')
          offset = count_less_than(extra_headers, index) - 1
          date = dates[int(max((index-(1 + offset)), 0)/3)] #TODO: handles 3+ sections
          assess_type = assessment_type(contents)
          # Assessment
          if assess_type == ASSESSMENT:
            namespaced_title = get_title(item, ASSESSMENT)
            title = clean_title(namespaced_title)
            old_klass = NULL_ASSESSMENT
            new_klass = get_newly_assessed_class(item, namespaced_title)
            if (not is_file(namespaced_title)
            and not is_redirect_class(new_klass)
            and not (title in assessments and was_later_deleted(assessments[title]))): # ignore files, redirects, and mayflies
              if is_category(namespaced_title):
                init_cat_if_not_present(new_cats, namespaced_title)
                init_if_not_present(assessments, title)
                assessments[title]['creation_class'] = new_klass
                assessments[title]['creation_date'] = date

          if assess_type == REASSESSMENT:
            namespaced_title = get_title(item, REASSESSMENT)
            title = clean_title(namespaced_title)
            old_klass = get_reassessment_class(item, 'OLD')
            new_klass = get_reassessment_class(item, 'NEW')
            if not is_file(namespaced_title):
              init_if_not_present(assessments, title)
              if is_redirect_class(new_klass): # tag redirect updates as removals, unless later recreated
                if not (is_draft_class(old_klass) and 'creation_class' in assessments[title]): # Ignore if this is a draft -> mainspace move in 2 lines
                  assessments[title]['was_removed'] = 'yes'
              elif is_redirect_class(old_klass): # treat redirect -> non-redirect as a creation
                assessments[title]['creation_class'] = old_klass
                assessments[title]['updated_class'] = new_klass
                assessments[title]['creation_date'] = date
              else: # only add the latest change, and only if there's no newer deletion
                if 'updated_class' not in assessments[title] and not was_later_deleted(assessments[title]):
                  assessments[title]['updated_class'] = new_klass

          # Rename
          if assess_type == RENAME:
            namespaced_old_title = get_rename_title(item, 'OLD')
            namespaced_new_title = get_rename_title(item, 'NEW')
            if not is_file(namespaced_new_title) and not is_category(namespaced_new_title):
              new_title = clean_title(namespaced_new_title)
              if is_draft(namespaced_old_title) and not is_draft(namespaced_new_title):
                init_if_not_present(assessments, new_title)
                if not was_later_updated(assessments[new_title]) and not was_later_deleted(assessments[new_title]):
                  assessments[new_title]['creation_class'] = DRAFT_CLASS
                  assessments[new_title]['updated_class'] = "Unassessed"
                  assessments[new_title]['creation_date'] = date
              if is_draft(namespaced_new_title) and not is_draft(namespaced_old_title):
                init_if_not_present(assessments, new_title)
                if not was_later_updated(assessments[new_title]) and not was_later_deleted(assessments[new_title]):
                  assessments[new_title]['creation_class'] = "Unassessed"
                  assessments[new_title]['updated_class'] = DRAFT_CLASS
                  assessments[new_title]['creation_date'] = date

          # Removal
          if assess_type == REMOVAL:
            namespaced_title = get_title(item, REMOVAL)
            # Articles
            if not is_file(namespaced_title):
              title = clean_title(namespaced_title)
              if title not in assessments: # don't tag if there's a newer re-creation
                assessments[title] = { 'was_removed': 'yes' }
                if is_category(namespaced_title):
                  assessments[title]['creation_class'] = CATEGORY_CLASS
                if is_draft(namespaced_title):
                  assessments[title]['creation_class'] = DRAFT_CLASS
            # Categories
            if is_category(namespaced_title) and namespaced_title not in new_cats:
              new_cats[namespaced_title] = 'was_removed'

    return {'assessments': assessments, 'new_cats': new_cats}
--PresN 04:25, 10 February 2020 (UTC)
Hi @AntiCompositeNumber, checking back—need anything else from us? czar 04:36, 25 February 2020 (UTC)
I don't think so, but I'll need to sit down and look at everything again. I'll probably have a chance for that sometime in the next two weeks. --AntiCompositeNumber (talk) 00:11, 2 March 2020 (UTC)

Please remove residence from Infobox person

Hi there, re: this permalinked discussion, could you stellar bot handlers please remove the |residence= parameter and its contents from articles using {{Infobox person}}? Per some of the discussions, Category:Infobox person using residence might list most of the pages using this template. And RexxS said:

"Using an insource search (hastemplate:"infobox person" insource:/residence *= *[A-Za-z\[]/) shows 36,844 results, but it might have missed a few (like {{plainlist}}); there are at least 766 uses of the parameter with a blank value."

I don't know if this helps. This is not my exact area of expertise. Thanks! Cyphoidbomb (talk) 05:31, 27 December 2019 (UTC)
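For what it's worth, RexxS's insource pattern translates directly to Python for scanning page text offline. This is only a detection sketch (the function name is hypothetical); an actual removal run would want a template-aware parser such as mwparserfromhell rather than a regex:

```python
import re

# RexxS's insource pattern, translated to Python: a |residence=
# parameter whose value starts with a letter or a wikilink.
FILLED = re.compile(r"residence *= *[A-Za-z\[]")
# A |residence= immediately followed by another parameter, a closing
# brace, or a newline counts as blank.
BLANK = re.compile(r"residence *= *(?=[|}\n])")

def residence_status(wikitext):
    """Classify |residence= usage on a page: 'filled', 'blank', or None."""
    if FILLED.search(wikitext):
        return "filled"
    if BLANK.search(wikitext):
        return "blank"
    return None
```

As RexxS notes, the filled-value pattern misses wrappers like {{plainlist}}, so a bot run should not rely on it alone.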

I explained my objection to this proposal in the Removal section immediately below the closed discussion in the permalink above. It is not a good idea to edit 38,000 articles if the only objective is a cosmetic update. Further, there is no rush and the holiday season is not a good time to make a fait accompli of an edit to the template performed on Christmas Day. Johnuniq (talk) 06:40, 27 December 2019 (UTC)
Cyphoidbomb, I already have a bot task that can handle this, but it sounds like there is some contention about the actual removal, so ping me somewhere if and when the decision about how to deprecate the param is finished. Primefac (talk) 16:24, 27 December 2019 (UTC)
@Primefac and Johnuniq: OK, I'm certainly in no hurry. Cyphoidbomb (talk) 16:34, 27 December 2019 (UTC)
I just want to add support for this. I'm quite tired of seeing the residence error when I do quick previews before saving edited bios. МандичкаYO 😜 11:02, 8 February 2020 (UTC)

A heads up for AfD closers re: PROD eligibility when approaching NOQUORUM

Revisiting this March discussion for a new owner

When an AfD listing ends with no discussion, WP:NOQUORUM indicates that the closing admin should treat the article as an expired PROD ("soft delete"). As a courtesy/aid for the closer, it would be really helpful for a bot to report the article's PROD eligibility ("the page is not a redirect, never previously proposed for deletion, never undeleted, and never subject to a deletion discussion"). Cribbing from the last discussion, it could look like this:

  • When an AfD listing begins its seventh/final day (almost full term) with no discussion, a bot posts a comment on whether the article is eligible for soft deletion by checking the PROD criteria that the page:
    • isn't already redirected (use API)
    • hasn't been PROD'd before (check edit summaries and/or diffs; or edit filter if ever created)
    • has never been undeleted (check logs)
    • hasn't been in a deletion discussion before (check page title and talk page banners)
    • nice-to-have: list prior titles for reference, if the article has been moved or nominated under another name before
  • To check whether anyone has participated in the AfD, @Izno suggested borrowing the AfD counter script's detection

This would greatly speed up the processing of these nominations. Eventually would be great to have this done automatically, but even a user script would be helpful for now. czar 19:26, 29 December 2019 (UTC)
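The four criteria reduce to a conjunction of negative checks, so the decision logic itself is tiny; the hard part is gathering the facts. A hedged sketch (the dict keys and function name are assumptions, not any existing bot's schema):

```python
def soft_delete_eligibility(page):
    """Decide soft-deletion eligibility from pre-gathered facts.

    `page` is a plain dict whose keys are illustrative only; a real
    bot would populate them from the API (redirect status), the page
    history (prior PRODs), and the logs (undeletions, prior AfDs).
    """
    reasons = []
    if page["is_redirect"]:
        reasons.append("the page is currently a redirect")
    if page["had_prod"]:
        reasons.append("previously proposed for deletion")
    if page["was_undeleted"]:
        reasons.append("previously undeleted")
    if page["prior_afds"]:
        reasons.append("previously discussed at "
                       + ", ".join(page["prior_afds"]))
    return (not reasons, reasons)
```

The bot's posted comment could then summarize the strongest reason rather than the full list.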

@Czar: Is it good enough if a bot just reports these attributes for AfD expired with no discussion?
  1. Whether the page is redirected or not
  2. List all previous WP:AfD and WP:AFU discussions with their results.
--Kanashimi (talk) 05:52, 24 January 2020 (UTC)
I was thinking that a more generally scoped bot which tells the AfD whether there were previous redirections, (un)deletions and deletion discussions might be useful to inform the discussion of past changes. Jo-Jo Eumerus (talk) 09:23, 24 January 2020 (UTC)
@Kanashimi, that would cover 75% of the criteria a closer needs to know (and would at least be a start!), so it would also need to remind the closer to check the page history for prior PRODs. Otherwise, yes, that's exactly what I think would work here. Essentially, if it detects a positive for any of those criteria, it would be nice to summarize that the article is ineligible for soft deletion because of x criterion. czar 13:03, 25 January 2020 (UTC)

@Czar: For Wikipedia:Articles for deletion/Log/2020 February 3, I extract information like this: report. Is the information enough? --Kanashimi (talk) 10:06, 4 February 2020 (UTC)

@Kanashimi, it's a start! I was thinking of formatting along the lines of:
Extended content

posting something like this to the AfD discussion when no one else has !voted

In this case, wouldn't need to list the entire history but just say at a glance (or the strongest reason) why the article isn't eligible for soft deletion. Eh? czar 02:46, 8 February 2020 (UTC)
@Czar: How about this report? --Kanashimi (talk) 11:13, 8 February 2020 (UTC)
@Kanashimi, this is great! If it ran at the beginning of the 7th day of listing for Articles for deletion/Vikram Shankar and Articles for deletion/Anokhi, which for lack of participation would appear eligible for soft deletion, the closer would know that it's not actually the case. I'm not sure that the list of deletions/undeletions is needed for this case but open to other opinions. At the very least, pictorial image use is historically discouraged in AfD discussions. A few fixes:
  • The Wikipedia:Articles for deletion/List of Greta Thunberg speeches would need a tweak. What would make it ineligible is if the existing article (under discussion) was redirected elsewhere, leaving its history in the same location, meaning that someone redirected it in lieu of deletion. In this case, the article (and its page history) was moved to a new location, so this case should check both whether the title redirects AND whether the page history remains. Page moves would still be eligible for soft deletion/expired PROD by my read.
  • Tok Nimol is presented as undeleted but its log doesn't show a restoration?
  • Reem Al Marzouqi: The rationale for this should not be "previously deleted" but specifically "previously discussed at AfD", which supersedes whether or not it was deleted. (Deletion itself doesn't make the article ineligible—e.g., Hasan Piker and Heed were each only deleted through CSD—but specific signs that someone has previously considered the article ineligible for PROD.) Same applies to the remaining "2nd+ nomination"s listed.
  • And of course there's the caveat that the script wouldn't have actually run on most of these (all but four?) since the rest had at least some participation.
  • Will this script catch whether the article was previously PROD'd? If not, would want to add something to the text to remind the closer to check. The rationale for Heed (cat)'s ineligibility, for example, is that the article was previously PROD'd and contested (03:32, 1 June 2009), not that it was previously deleted via CSD. Same for Paatti, which actually shows the PROD in the log (most do not, to my understanding).
  • Ayalaan actually appears eligible for soft deletion. Its deletion was through CSD and it appears to have not been previously PROD'd. As long as the script confirmed that the article was not tagged for PROD before, this would be a great case of where the script could say that the article appears eligible.
Thanks for your work on this! It's going to be really helpful. czar 14:18, 8 February 2020 (UTC)
  • @Czar: I fixed some bugs and generated 1, 2, 3, 4, 5.
Ayalaan: Do you mean that CSD deletions should not be taken into account? If so, it is easy to fix.
Heed (cat): Parsing comments does not seem easy, and fetching all revisions is expensive, so I have not decided yet.
Please check the results and tell me if there are still things to fix. --Kanashimi (talk) 08:01, 9 February 2020 (UTC)
Yep, CSD/BLPPROD doesn't affect PROD/soft deletion eligibility (WP:PROD#cite_ref-1), so don't need to track that.
If the v1 won't parse edit summaries or diffs, I've modified the collapsed section above with some suggested boilerplate. Of course, would be great if it could, but this would do for now.
It looks like none of those results would trigger the bot, because it detects participation in each? The case of Madidai Ka Mandir should let the bot run, since the only participation is from a delsort script. Henri Ben Ezra should let the bot run too (to post that it's ineligible based on having a prior AfD). So participation detection would need to be tightened: if the bot/script detects one or fewer delete/redirect participations, it should run (e.g., Ayalaan and Anokhi). You'd probably also want the bot to only run when the nom hasn't been relisted, or else it could potentially run twice on the same nomination.
As for the logs, related discussions, and previous discussions, I think it might be overkill to post these. It could be potentially interesting as its own bot task, if there is consensus for it, but I think simply showing "soft deletion" eligibility is sufficient for this task. I'll ask Wikipedia talk:AfD for input. czar 16:42, 9 February 2020 (UTC)
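A rough sketch of that gate (one-or-fewer !votes, never relisted), with the caveat that both regexes are assumptions about AfD formatting; the AfD counter script Izno mentioned does this detection more carefully:

```python
import re

# Bolded recommendations such as '''Delete''' or '''Keep'''. Delsort
# notices and unbolded comments are not counted, which handles the
# delsort-only participation case above.
VOTE = re.compile(r"'''\s*(delete|keep|merge|redirect)\b", re.IGNORECASE)
# Relist marker: an assumption about the relisting boilerplate text.
RELISTED = re.compile(r"relisted to generate", re.IGNORECASE)

def should_check_quorum(afd_wikitext):
    """True when an AfD has at most one !vote and was never relisted."""
    if RELISTED.search(afd_wikitext):
        return False
    return len(VOTE.findall(afd_wikitext)) <= 1
```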
The latest version: 1, 2, 3, 4, 5.
Please check the results and tell me if there are still some things to fix. --Kanashimi (talk) 00:57, 10 February 2020 (UTC)
I didn't check all logs, but from the ones I did, the log analysis looks good! It doesn't look like the tests were doing "no quorum" detection, so as long as the script knows when it should run on a discussion (one or zero !votes in the last 24 hours of the AfD's seven-day listing) then sounds good to proceed to the next step/trial. Thanks! czar 01:49, 17 February 2020 (UTC)
@Czar: If you think it is good enough, I will file a bot request. I will generate some reports at sandbox next days. --Kanashimi (talk) 23:14, 17 February 2020 (UTC)
@Kanashimi, sounds good czar 04:36, 25 February 2020 (UTC)
@Czar: BRFA filed --Kanashimi (talk) 11:32, 26 February 2020 (UTC)

Consolidating multiple WikiProject templates into taskforces of template:WikiProject Molecular Biology

Related post: Wikipedia:Bot_requests/Archive_79

I'm in need of help replacing all instances of a set of WikiProject templates with taskforces of the one unified template: {{WikiProject Molecular Biology}}. Unfortunately, a simple transclusion of the new template wrapped in the old templates isn't enough, since some pages have multiple WikiProject templates and so will need to be marked with multiple taskforces. It's therefore similar to when Neurology was merged into WP:MED.

Example manual edit:


{{WikiProject Molecular and Cellular Biology|class=GA|importance=high|peer-review=yes}}
{{WikiProject Computational Biology|importance=mid|class=GA}}


becomes:

{{WikiProject Molecular Biology|class=GA|importance=high|peer-review=yes
  |MCB=yes     |MCB-imp=high
  |COMPBIO=yes |COMPBIO-imp=mid
}}

Broadly, I think the necessary bot steps would be:

  1. If {{WikiProject Molecular and Cell Biology}} OR {{WikiProject Genetics}} OR {{WikiProject Computational Biology}} OR {{WikiProject Biophysics}} OR {{WikiProject Gene Wiki}} OR {{WikiProject Cell Signaling}}
    Then add {{WikiProject Molecular Biology}}
  2. For {{WikiProject Molecular and Cell Biology}} AND {{WikiProject Genetics}} AND {{WikiProject Computational Biology}} AND {{WikiProject Biophysics}} AND {{WikiProject Gene Wiki}}
    Remove {{WikiProject MCB/COMPBIO/Genetics/Biophysics/Gene Wiki|importance=X|quality=y}}
    Add |MCB/COMPBIO/genetics/biophysics/Gene Wiki=yes + |MCB-imp/COMPBIO-imp/genetics-imp/biophysics-imp/GW-imp=X (note: GW → Gene Wiki)
  3. For whichever WikiProject template has the highest |importance= and |quality=, add that as the overall |importance= and |quality= to {{WikiProject Molecular Biology}}
  4. Additionally add to articles in the following categories:

Thank you in advance! T.Shafee(Evo&Evo)talk 07:09, 12 January 2020 (UTC) (refactored/edited by Seppi333 (Insert ) 05:48, 18 January 2020 (UTC))
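Steps 1–2 amount to a pure wikitext transformation. A sketch under heavy assumptions: the parameter names come from the example edit above, only three of the six banners are mapped, extra parameters such as |peer-review= are dropped, step 3's overall-importance logic is omitted, and a production bot would use a real template parser (e.g. mwparserfromhell) instead of a regex:

```python
import re

# Old banner -> (taskforce flag, importance parameter), following the
# example edit above.
TASKFORCES = {
    "WikiProject Molecular and Cellular Biology": ("MCB", "MCB-imp"),
    "WikiProject Computational Biology": ("COMPBIO", "COMPBIO-imp"),
    "WikiProject Genetics": ("genetics", "genetics-imp"),
}

BANNER = re.compile(r"\{\{(WikiProject [^|}]+)\|([^}]*)\}\}")

def merge_banners(wikitext):
    """Replace mapped banners with one {{WikiProject Molecular Biology}}."""
    taskforce_params, shared = [], {}

    def strip(match):
        name = match.group(1).strip()
        if name not in TASKFORCES:
            return match.group(0)  # leave unrelated banners alone
        fields = dict(p.split("=", 1)
                      for p in match.group(2).split("|") if "=" in p)
        flag, imp = TASKFORCES[name]
        taskforce_params.append(f"|{flag}=yes |{imp}={fields.get('importance', '')}")
        shared.setdefault("class", fields.get("class", ""))
        return ""

    rest = BANNER.sub(strip, wikitext).strip()
    merged = "{{WikiProject Molecular Biology|class=%s\n  %s\n}}" % (
        shared.get("class", ""), "\n  ".join(taskforce_params))
    return (merged + "\n" + rest).strip()
```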

We posted threads about this on different pages at the same time, so I figured I'd follow up here as well. I can implement this myself using template wrappers and/or a new bot (re: Wikipedia talk:WikiProject Molecular Biology#Template:WikiProject Molecular Biology, as described in the sub-section); I just need a little more feedback from WT:MOLBIO. That said, you've more or less answered my question on how to do it here. Seppi333 (Insert ) 01:56, 16 January 2020 (UTC)
@Primefac: You mentioned in the earlier thread on this topic that one can use Anomiebot to merge templates using {{Subst only|auto=yes}} template to merge one banner into another, but is there any support for merging multiple banners on a single page into 1? If not, are there any bots that have been approved to merge multiple project banners on talk pages (particularly where 2+ banners occur on a single page) into a single parent banner? Asking because I could likely modify the source code of a bot designed to merge the banners of another project's task forces for this purpose, especially if there's one written in python. Seppi333 (Insert ) 03:45, 16 January 2020 (UTC)
@Seppi333: You make a good point about what to put as overall WP:MOLBIO class and importance based on WP:MCB, WP:GEN etc. at WT:MOLBIO. I think the best option is to simply use the current taskforce importance (if something's high importance to the WP:GEN taskforce, chances are it's high importance to the WP:MOLBIO wikiproject). The edge case is when two taskforces currently indicate different importance levels (e.g. Talk:DNA_gyrase). In such cases it might be safest to use the median rounded up for the overall importance (high+low→mid, high+mid→high), but maybe that's over complicating things. T.Shafee(Evo&Evo)talk 04:45, 16 January 2020 (UTC)
Sure, I ran a bot like this last weekend. I could probably put in a BRFA today or tomorrow if I get time. Primefac (talk) 10:55, 16 January 2020 (UTC) I did just notice, though, that there are also sub-projects for each of the (now) sub-projects; are those task forces (such as genetic engineering or education) being handled by the replacement template as well? Primefac (talk) 10:59, 16 January 2020 (UTC)
@Primefac: No, don't think so. The primary reason the Gene Wiki sub-task force was added is that it has its own banner (w/ corresponding article categories: {{WikiProject Gene Wiki}} & Category:Gene Wiki articles) which is currently present on ~1800 pages. I think we're probably just going to go with the current task force listing in the {{WPMOLBIO}} template.
@Evolution and evolvability: I added the signaling parameter for categorizing cell signaling articles; Category:Metabolism is an article category and the metabolic pathways task force doesn't have its own category, so I couldn't add the metabolism one.
Addendum, re: The edge case is when two taskforces currently indicate different importance levels (e.g. Talk:DNA_gyrase). In such cases it might be safest to use the median rounded up for the overall importance (high+low→mid, high+mid→high), but maybe that's over complicating things. It wouldn't be that technical to encode that. Programmatically, one just needs to ordinally encode low→1, mid→2, high→3, top→4 (NB: this method implicitly assumes that there's an equal "importance distance" in a mathematical/statistical sense between importance ratings, which might not necessarily be true - it depends on how people go about rating importance on average), then use round(median(list of ratings)) or round(average(list of ratings)), then remap whatever number it returns back to an importance rating. E.g., the average rating of task forces that rate an article as low, high, and top is (1+3+4)/3, which would be rounded to 3 → high importance. Seppi333 (Insert ) 02:56, 18 January 2020 (UTC)
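One wrinkle worth flagging if this ever gets coded in Python: the built-in round() rounds halves to even (round(2.5) == 2), which would turn high+mid into mid rather than high as proposed. A sketch that rounds halves up instead:

```python
from statistics import median

RANKS = {"low": 1, "mid": 2, "high": 3, "top": 4}
NAMES = {rank: name for name, rank in RANKS.items()}

def overall_importance(ratings):
    """Median of the taskforce ratings, with halves rounded up.

    high+low -> mid and high+mid -> high, per the scheme above.
    int(m + 0.5) is used instead of round() to avoid banker's
    rounding on the .5 cases.
    """
    m = median(RANKS[r.lower()] for r in ratings)
    return NAMES[int(m + 0.5)]
```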

@Evolution and evolvability: I refactored the request in Special:Diff/936327774/936342626 to reflect the changes to the template. You might want to look it over just to make sure nothing seems off. Seppi333 (Insert ) 05:48, 18 January 2020 (UTC)

@Seppi333: That looks correct to me! Great to see it coming together. I'll also go through the taskforce pages and relevant template documentation over the next few days to make sure the instructions for tagging new articles are up to date (example). T.Shafee(Evo&Evo)talk 23:11, 22 January 2020 (UTC)
@Primefac: Are you still interested in doing this? Either way, can you point me to the bot script you had in mind in the event I have a need for reprogramming it to run a similar bot in the future? Seppi333 (Insert ) 04:42, 21 February 2020 (UTC)
This seems to meet the criteria for Task 30, so I should be able to get to it this weekend. Primefac (talk) 11:59, 21 February 2020 (UTC)

Correcting links to Mexican Federal Telecommunications Institute documents

This is simple and can be handled by just about any bot.

The Federal Telecommunications Institute (IFT) of Mexico made a one-character change in document URLs that will need updating. Hundreds of Mexican radio articles cite its technical and other authorizations.

They added a "v" to the URL, so URLs that were formerly

changed to

Is this possible to have done as a bot task? The articles that need it are mostly in Category:Radio stations in Mexico or Category:Television stations in Mexico. Raymie (tc) 20:10, 4 February 2020 (UTC)

Huh. I was certain there was some kind of general bot for this kind of link replacement operation... Jo-Jo Eumerus (talk) 17:35, 7 February 2020 (UTC)
Jo-Jo Eumerus, I think GreenC (talk · contribs) ends up doing most of them at WP:URLREQ --AntiCompositeNumber (talk) 20:34, 7 February 2020 (UTC)

@Raymie: In addition, they now serve https only but left no http->https redirect, and almost all of the links on WP are http. This should be done by a URL-specific bot because of archive URLs and {{dead link}} tags (some may already be marked dead and/or archived and will need to be unwound once corrected). Could you post/copy the request to URLREQ? There is a backlog, but I will get to it. -- GreenC 20:02, 8 February 2020 (UTC)

Request for change of (soon to be) broken links to LPSN[edit]

Thread moved to Wikipedia:Link_rot/URL_change_requests#Request_for_change_of_(soon_to_be)_broken_links_to_LPSN and poster notified. -- GreenC 03:26, 14 February 2020 (UTC)

Bot needed to tell Wikiprojects about open queries on articles tagged for that WikiProject[edit]

I sometimes put queries on article talkpages, some get answered quickly, some stick around indefinitely and occasionally old ones get resolved. My suspicion is that my experience is not unusual, but I hope that this is a software issue and that a lot more article queries could be resolved if the relevant editors knew of them. Would it be possible to have a bot produce reports for each Wikiproject of open/new talk page threads that are on pages tagged to that project? ϢereSpielChequers 09:49, 10 February 2020 (UTC)

The main thing is how do those 'queries' get detected? What constitutes 'queries'? Headbomb {t · c · p · b} 12:44, 10 February 2020 (UTC)
One way that might work, but would throw a lot of false positives, would be to notify the project(s) if there is a post that has no reply after a week. Another option would be to have some form of template like {{SPER}} that could summon the bot if a user wanted more input. Primefac (talk) 12:50, 10 February 2020 (UTC)
@Headbomb I was assuming a new query would be any new section on the talkpage of an article tagged for that wikiproject, excluding any tagged as {{resolved}}. @Primefac I'm not sure of the false positives, other than on multi tagged articles. If that did get to be an issue it might be necessary to give people going through such a report the option to mark a section as not relevant to their wikiproject. So an article about a mountain might be tagged under vulcanism, climbing, skiing and still get a query as to the gods that some religion believes live on it. But I suspect that the false positives will not be a huge issue. ϢereSpielChequers 16:16, 10 February 2020 (UTC)
"any new section on the talkpage of an article tagged for that wikiproject, excluding any tagged as {{resolved}}" given that most sections on talk pages don't need to be marked as {{resolved}} to begin with, I can't see this idea/criteria getting consensus. The signal-to-noise ratio would be ludicrously small. Taking Talk:Clara Schumann from a few sections above as an example, that would be 39 'queries' for that article alone. Headbomb {t · c · p · b} 00:50, 11 February 2020 (UTC)
Clearly that article is not typical. But the most recent thread is from January this year, the previous one from last October, so a report of any new section would include it now provided new was interpreted as broadly as thirty days. If we only went back 7 days it would already have dropped off the report. In the unlikely event of needing to make the report shorter, if someone has a tool for identifying signatures it could list single participant threads. ϢereSpielChequers 07:17, 11 February 2020 (UTC)
I would certainly veto such an idea. These would likely be spam levels of updates, and duplications for those that would watch the updater and also the article itself. I would assume most would unwatch the updated list of "queries" pretty quickly, which would be pointless. A better solution is to post on the wikiproject talk page if a post doesn't get enough attention.
There are also a LOT of inactive/semi active Wikiprojects that would get a lot of bot updates, for no one to read. Seems like a lot of work and edits when we could simply post something on the wikiproject talk page to gain additional input. Best Wishes, Lee Vilenski (talkcontribs) 08:45, 11 February 2020 (UTC)
Yes lots of wikiprojects are inactive, perhaps some will be revived by having this report, others will be unchanged. The report would be a success if an increased proportion of talkpage queries get a response, 100% response rate would be nice, but this report aims to reduce a problem not to totally resolve it. As for posting things on WikiProject talkpages, that is reasonable advice to the regulars, not something we expect newbies to do, and in case it wasn't obvious, it is unnoticed queries by newbies that I worry most about. ϢereSpielChequers 09:15, 11 February 2020 (UTC)
This can be done without a bot with RecentChangesLinked where you set it to "Show pages linking to". --Izno (talk) 16:25, 11 February 2020 (UTC)
That just gives you an indication that there has been a change to a page not that there is a query that needs to be responded to on that page. Keith D (talk) 00:03, 12 February 2020 (UTC)
  • If I'm reading the intent correctly, I think this can be resolved by, alternatively, (1) using WikiProject banners to encourage editors to ask the question there instead, (2) relying on WP:Article alerts to list all WP:RFCs of consequence in the project, or (3) adding some tag with lower stakes than an RFC (e.g., a variant of {{help me}}) and submitting a feature request for WP:Article alerts to track that template/tag. On the whole, could use more evidence that this is an actual problem. Agreed that it would be a lot of noise to create a listing for every new, unreplied talk page section on every project page, especially when such sections do not necessarily require responses (e.g., "FYI" messages). czar 01:57, 17 February 2020 (UTC)
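For what it's worth, the "no reply after a week" heuristic suggested above can be prototyped as a pure function over talk-page wikitext. A rough sketch, assuming standard signed (UTC) timestamps; unsigned posts, {{resolved}} tags, and archive handling are deliberately ignored here:

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the timestamp portion of a standard signature, e.g.
# "09:49, 10 February 2020 (UTC)".
SIG = re.compile(r"\d{2}:\d{2}, \d{1,2} \w+ \d{4} \(UTC\)")

def stale_sections(talk_wikitext, now, max_age_days=7):
    """Return headings of sections with exactly one signature older than
    the cutoff - i.e. posts that never got a reply."""
    stale = []
    parts = re.split(r"^==\s*(.*?)\s*==\s*$", talk_wikitext, flags=re.M)
    # re.split with a capturing group yields [preamble, heading1, body1, ...]
    for heading, body in zip(parts[1::2], parts[2::2]):
        stamps = SIG.findall(body)
        if len(stamps) != 1:
            continue  # zero signatures (unsigned) or already replied to
        posted = datetime.strptime(
            stamps[0], "%H:%M, %d %B %Y (UTC)").replace(tzinfo=timezone.utc)
        if now - posted > timedelta(days=max_age_days):
            stale.append(heading)
    return stale
```

This illustrates why the signal-to-noise concern above matters: the function flags every lone post, not just actual questions.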

Copy coordinates from lists to articles[edit]

Virtually every one of the 3000-ish places listed in the 132 sub-lists of National Register of Historic Places listings in Virginia has an article, and with very few exceptions, both lists and articles have coordinates for every place, but the source database has lots of errors, so I've gone through all the lists and manually corrected the coords. As a result, the lists are a lot more accurate, but because I haven't had time to fix the articles, tons of them (probably over 2000) now have coordinates that differ between article and list. For example, the article about the John Miley Maphis House says that its location is 38°50′20″N 78°35′55″W / 38.83889°N 78.59861°W / 38.83889; -78.59861, but the manually corrected coords on the list are 38°50′21″N 78°35′52″W / 38.83917°N 78.59778°W / 38.83917; -78.59778. Like most of the affected places, the Maphis House has coords that differ only a small bit, but (1) ideally there should be no difference at all, and (2) some places have big differences, and either we should fix everything, or we'll have to have a rather pointless discussion of which errors are too little to fix.

Therefore, I'm looking for someone to write a bot to copy coords from each place's NRHP list to the coordinates section of {{infobox NRHP}} in each place's article. A few points to consider:

  • Some places span county lines (e.g. bridges over border streams), and in many of these cases, each list has separate coordinates to ensure that the marked location is in that list's county. For an extreme example, Skyline Drive, a long scenic road, is in eight counties, and all eight lists have different coordinates. The bot should ignore anything on the duplicates list; this is included in citation #4 of National Register of Historic Places listings in Virginia, but I can supply a raw list to save you the effort of distilling a list of sites to ignore.
  • Some places have no coordinates in either the list or the article (mostly archaeological sites for which location information is restricted), and the bot should ignore those articles.
  • Some places have coordinates only in the list or only in the article's {{Infobox NRHP}} (for a variety of reasons), but not in both. Instead of replacing information with blanks or blanks with information, the bot should log these articles for human review.
  • Some places might not have {{infobox NRHP}}, or in some cases (e.g. Newport News Middle Ground Light) it's embedded in another infobox, and the other infobox has the coordinates. If {{infobox NRHP}} is missing, the bot should log these articles for human review, while embedded-and-coordinates-elsewhere is covered by the previous bullet.
  • I don't know if this is the case in Virginia, but in some states we have a few pages that cover more than one NRHP-listed place (e.g. Zaleski Mound Group in Ohio, which covers three articles); if the bot produced a list of all the pages it edits, a human could go through the list, find any entries with multiple appearances, and check them for fixes.
  • Finally, if a list entry has no article at all, don't bother logging it. We can use WP:NRHPPROGRESS to find what lists have redlinked entries.

I've copied this request from an archive three years ago; an off-topic discussion happened, but no bot operators offered any opinions. Neither then nor now has any discussion been conducted for this idea; it's just something I've thought of. I've come here basically just to see if someone's willing to try this route, and if someone says "I think I can help", I'll start the discussion at WT:NRHP and be able to say that someone's happy to help us. Of course, I wouldn't ask you actually to do any coding or other work until after consensus is reached at WT:NRHP. Nyttend (talk) 15:53, 12 February 2020 (UTC)
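If it helps a prospective botop scope this: the bullet rules above reduce to a small decision function, with all the data plumbing (parsing {{NRHP row}} and {{Infobox NRHP}}) left out. A sketch only; coordinates are modeled as (lat, lon) tuples or None:

```python
def action(list_coord, article_coord, on_duplicates_list):
    """Decide what to do with one NRHP entry, per the request's rules."""
    if on_duplicates_list:
        return "skip"  # multi-county places have per-list coordinates
    if list_coord is None and article_coord is None:
        return "skip"  # e.g. restricted archaeological sites
    if (list_coord is None) != (article_coord is None):
        return "log for human review"  # coords on only one side
    if list_coord != article_coord:
        return "copy list coord to article"
    return "already in sync"
```

Articles missing {{Infobox NRHP}} entirely would also land in the "log for human review" bucket before this function is ever reached.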

You could use {{Template parameter value}} to pull the coordinate values out of the {{NRHP row}} template. It would still likely take a bot to do the swap but it would mean less updating in the future. Of course, if the values are 100% accurate on the lists then I suppose it wouldn't be necessary. Primefac (talk) 16:55, 12 February 2020 (UTC)
Never heard of that template before. It sounds like an Excel =whatever function, e.g. in cell L4 you type =B4 so that L4 displays whatever's in B4; is that right? If so, I don't think it would be useful unless it were immediately followed by whatever's analogous to Excel's "Paste Values". Is that what you mean by having a bot doing the swap? Since there are 3000+ entries, I'm sure there are a few errors somewhere, but I trust they're over 99% accurate. Nyttend (talk) 02:57, 13 February 2020 (UTC)
That's a reasonable analogy, actually. Check out the source of Normani#Awards_and_nominations: it pulls the wins and nominations values from the infobox at the "list of awards", which means the main article doesn't need to be updated every time the list is changed.
As far as what the bot would do, it would take one value of {{coord}} and replace it with a call to {{Template parameter value}}, pointing in the direction of the "more accurate" data. If the data is changed in the future, it would mean not having to update both pages.
Now, if the data you've compiled is (more or less) accurate and of the not-likely-to-change variety (I guess I wouldn't expect a monument to move locations) then this is a silly suggestion – since there wouldn't be a need for automatic syncing – and we might as well just have a bot do some copy/pasting. Primefac (talk) 21:27, 14 February 2020 (UTC)
Y'know, this sort of situation is exactly what Wikidata is designed for... --AntiCompositeNumber (talk) 22:29, 14 February 2020 (UTC)
Primefac, thank you for the explanation. The idea sounds wonderful for situations like the list of awards, but yes these are rather accurate and unlikely to change (imagine someone picking up File:Berry Hill near Orange.jpg and moving it off site), so the bot copy/paste job is probably best. Nyttend (talk) 02:23, 15 February 2020 (UTC)
By the way, Primefac, are you a bot operator, or did you simply come here to offer useful input as a third party? Nyttend (talk) 03:12, 20 February 2020 (UTC)
I am both botop and BAG, but I would not be offering to take up this task as it currently stands. Primefac (talk) 11:24, 20 February 2020 (UTC)
Thank you for helping me understand. "as it currently stands" Is there something wrong with it, i.e. if changes were made you'd be offering, or do you simply mean that you have other interests (WP:VOLUNTEER) and don't feel like getting involved in this one? This question might sound like I'm being petty; I'm writing with a smile and not trying to complain at all. Nyttend (talk) 00:27, 21 February 2020 (UTC)
I came here to say what AntiCompositeNumber said. It's worth emphasising: this is exactly what Wikidata is designed for. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:25, 17 March 2020 (UTC)

Harvard Bot[edit]

I've been using User:Ucucha/HarvErrors.js for a few days now, and it's a pretty nice little script. However, the issues it highlights should be flagged for everyone to see and become part of regular cleanup. For example, in Music of India, two {{harv}}-family templates are used to generate references to anchors, designed to point to a full citation.

However, inspecting the page reveals those anchors aren't found anywhere on the page. Even a manual search won't find the corresponding citations on that page, because this isn't an issue of someone having forgotten a |ref=harv in a citation template; they just aren't there to begin with.

A bot should flag those problems, probably with a new template {{broken footnote}}, or possibly on the talk page.

Headbomb {t · c · p · b} 15:13, 21 February 2020 (UTC)
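For whoever picks this up: the core check is small once you have the rendered page HTML (e.g. via the MediaWiki action API with action=parse), since it amounts to comparing linked CITEREF anchors against defined ones. A sketch of that comparison as a pure function; the HTML-fetching step is omitted:

```python
import re

def broken_harv_targets(html):
    """Return CITEREF anchors that are linked to but never defined.
    Operates on rendered page HTML, since wikitext templates cannot
    see the final page themselves."""
    wanted = set(re.findall(r'href="#(CITEREF[^"]+)"', html))
    defined = set(re.findall(r'id="(CITEREF[^"]+)"', html))
    return wanted - defined
```

Anything the function returns is a {{harv}}-family link whose full citation is missing or mis-anchored.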

Looks like the inline ref problem template {{citation not found}} is designed for this already. --AntiCompositeNumber (talk) 15:18, 21 February 2020 (UTC)
@AntiCompositeNumber: – {{citation not found}} is too general, for citations that are completely missing (see harv problems, #1). This is more specific. The citation could be there, but simply not linked to correctly. The bot shouldn't add a tag to a footnote tagged with {{citation not found}}, though, since that would be almost sure to be redundant. Headbomb {t · c · p · b} 15:35, 21 February 2020 (UTC)
Perhaps a new template such as {{Harv error}} would be needed. Or one might rig the CS1 templates to produce an automatic error message ... Trappist the monk? Jo-Jo Eumerus (talk) 19:36, 21 February 2020 (UTC)
@Jo-Jo Eumerus: well, that's what {{broken footnote}} is. @Trappist the monk: CS1 should really emit |ref=harv automatically though. That would kill a great deal of those errors (although certainly not all). Headbomb {t · c · p · b} 20:12, 21 February 2020 (UTC)
Um, the preceding post was by me, not Trappist. I just pinged them. Jo-Jo Eumerus (talk) 20:50, 21 February 2020 (UTC)
Brainfart, meant to ping Trappist for the second part only. Headbomb {t · c · p · b} 21:01, 21 February 2020 (UTC)
This is a very good idea. Recently came across this in Easter Island ref #113 (Fischer 2008). There is no reference for Fischer 2008. In fact the reference is a faux-Harvard <ref>Fischer 2008: p. 149</ref>. Lots of permutations for Harvard reference problems that a specialized bot could become expert on. -- GreenC 20:09, 21 February 2020 (UTC)
Since there appears to be interest in this, Coding... No real preference about what template should be applied. --AntiCompositeNumber (talk) 22:50, 21 February 2020 (UTC)
BRFA filed --AntiCompositeNumber (talk) 20:28, 23 February 2020 (UTC)
Tweaking the harvard templates to show the User:Ucucha/HarvErrors.js output by default, and maybe add a maintenance category, would be much preferable. The display would be instant and it would save us from a lot of spammy templates and bot edits. This is also how the CS1/2 templates do it: i.e. Help:CS1 errors#Missing or empty |title= instead of a bot going around tagging these with {{title missing}}. Is there any technical reason why this couldn't be done? – Finnusertop (talkcontribs) 21:05, 23 February 2020 (UTC)
Finnusertop, Yes, that is not possible inside the template. To be able to determine if the link works or not, the template would need access to the rendered page HTML. Since the rendered page HTML is only available after the template itself is parsed and rendered, the template can't look at it. A MediaWiki extension could hook itself into the parsing chain, but changes in the parsing system make that a bad idea, especially in the next year or so. --AntiCompositeNumber (talk) 21:25, 23 February 2020 (UTC)
I see, AntiCompositeNumber. Then how about making User:Ucucha/HarvErrors.js a default gadget (or will the parser negatively affect that too)? – Finnusertop (talkcontribs) 21:30, 23 February 2020 (UTC)
@Finnusertop: That would be possible, but it would require someone to maintain it. Consensus would probably be required to have it default-on as well. Pages that have broken citations that have less editing traffic would also be unnoticeable, since the script wouldn't be able to apply tracking categories. This sort of problem is more directly comparable to dead links than missing parameters. --AntiCompositeNumber (talk) 21:38, 23 February 2020 (UTC)
Not only that, but turning that on would also throw pointless warnings (anchors without refs) and fail to populate maintenance categories. Headbomb {t · c · p · b} 21:50, 23 February 2020 (UTC)

Deleting tracking parts of URL from sources etc.[edit]

There are a lot of URLs in sources that have tracking extensions attached by Facebook; they should be deleted. I think that would be a fine job for a bot, and as it's probably happening unintentionally when editors copy'n'paste links without much thinking, it should probably be done once per day or week or so. The same probably goes for Google Analytics extensions with UTM parameters. Grüße vom Sänger ♫ (talk) 15:02, 22 February 2020 (UTC)

@AManWithNoPlan and Sänger: Probably a good idea to at least offload some of that to User:Citation bot. Headbomb {t · c · p · b} 16:06, 22 February 2020 (UTC)
This can be prone to breaking archive URLs and creating link rot if one is not careful. See WP:WEBARCHIVES for a list of the archives used on Enwiki and the formats they use. The regex at Wikipedia:Bots/Requests_for_approval/DemonDays64_Bot_2 is an example; it uses lookbehind to avoid URLs that are embedded in an archive URL, and User:DemonDays64 could probably help explain it. The other problem is that if you retain the tracking bits in the archive URL but remove them from the source |url=, they are now mismatched and look like different URLs; other bots might pick up on that and restore the archive URL version of the source URL, since it is the authority (once the link is dead). Personally, I would bypass any citation that involves an archive URL; too many complications. -- GreenC 16:23, 22 February 2020 (UTC)
Tracking bits are evil, they must go away. If web archives used this evil URL in the past, that's something we have to live with; better link rot than supplying facebook with anything. Grüße vom Sänger ♫ (talk) 16:27, 22 February 2020 (UTC)
@Sänger: not challenging the idea (it'd be great to clean up links if there weren't side effects) but think about this: we'd be hurting Facebook by leaving them; it gives them bad data every time someone clicks one that isn't actually in the place it was supposed to be. Still would be a good idea if only the archive bots would reliably understand. DemonDays64 (talk) 17:48, 22 February 2020 (UTC)
I think KolbertBot 4 operated by Jon Kolbert has approval for this. ‑‑Trialpears (talk) 22:48, 22 February 2020 (UTC)
If it can be done reliably without altering the generated contents, damaging links and causing linkrot, I would applaud any efforts to remove these tracking/click identifiers like utm_source, utm_medium, utm_campaign, utm_term, utm_content, gclid, gclsrc, dclid, fbclid, zanpid (see UTM parameters). Similar things could be done for some other known types of links as well; for example, Google Books links as used in many citations often contain all kinds of irrelevant parameters and could, in most cases, be reduced and normalized to just the id parameter identifying the book and some page information. If this causes problems in associating archived links, we should try to work with the archivers so they improve their matching algorithms - given our good relations with them I guess this already happens in at least some cases. We might also think about adding a module to the framework of citation templates containing a ruleset for a number of known sites which would at least highlight links containing unnecessary parameters in edit preview so the parameters can be manually removed by (knowledgeable) editors even before archives are created. --Matthiaspaul (talk) 11:48, 17 March 2020 (UTC)
  • I have a bot task that does this. I haven't run it in a while though. Primefac (talk) 14:43, 20 March 2020 (UTC)
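A minimal sketch of the stripping step, using the parameter list given above and skipping archive URLs per GreenC's caveat; the archive-domain checks are illustrative and incomplete:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Tracking/click identifiers listed in the thread above.
TRACKING = {"utm_source", "utm_medium", "utm_campaign", "utm_term",
            "utm_content", "gclid", "gclsrc", "dclid", "fbclid", "zanpid"}

def strip_tracking(url):
    """Remove known tracking parameters from a URL, leaving archive
    URLs untouched (the tracking bits are part of the snapshot ID)."""
    if "web.archive.org" in url or "archive.today" in url:
        return url
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in TRACKING]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Keeping the source |url= and any |archive-url= consistent with each other would still need the extra handling GreenC describes.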

Create WT: redirects according to WP: shortcuts[edit]

Would it be controversial to request a bot to create redirects in the Wikipedia talk namespace to the talk pages of the targets of redirects in the Wikipedia namespace? I've typed WT:xxx, expecting it's a shortcut given WP:xxx is, only to be disappointed it doesn't exist from time to time. Nardog (talk) 01:26, 24 February 2020 (UTC)

Not really. It'd create a lot of pointless ones for mostly unused redirects, but it's not like anyone will care. Should only cover those explicitly marked as {{R from shortcut}} though. Headbomb {t · c · p · b} 01:49, 24 February 2020 (UTC)
Should only... Why? It's not like WP: shortcuts technically exist in the main namespace, as in H:. I'd like e.g. WT:Actors to work, even though WP:Actors isn't marked as a shortcut. (I can see an argument for avoiding shortcuts to sections, though.) Nardog (talk) 03:23, 24 February 2020 (UTC)
@Nardog and Headbomb: There's an order of magnitude fewer tagged WP redirects without a talk page than all WP redirects without a talk page. There's definitely an argument to be made that the tagged redirects are generally more useful or more well-known than untagged redirects, and there is definitely a lot of chaff in the all redirects query. --AntiCompositeNumber (talk) 04:04, 24 February 2020 (UTC)
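The creation step itself is trivial once the candidate set is settled; the sketch below only builds the WT: redirect wikitext. It assumes section-anchored targets are skipped (per the concern above) and that the new redirect would itself be tagged {{R from shortcut}}, which is an assumption, not settled practice:

```python
def talk_redirect(wp_redirect_target):
    """Given the target of a WP: shortcut (e.g. 'Wikipedia:Manual of Style'),
    return the wikitext for the matching WT: redirect, or None if the
    target has a section anchor."""
    if "#" in wp_redirect_target:
        return None  # shortcuts to sections have no sensible talk twin
    talk_target = wp_redirect_target.replace("Wikipedia:", "Wikipedia talk:", 1)
    return "#REDIRECT [[%s]] {{R from shortcut}}" % talk_target
```

The hard part, as the query numbers above suggest, is deciding which of the untagged redirects are worth covering at all.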

Maintenance tags for questionable sources[edit]

When using footnoted referencing, the task of assessing what source supports what text is complicated. A reference may be tagged e.g. {{self-published source}}, {{self-published inline}}, {{deprecated inline}}, {{dubious}} and other tags which may be applied to the footnoted reference, but these are not linked to the readable content. When using <ref> tags, by contrast, we can use <nowiki><ref>{{cite [...] |publisher=$VANITYPRESS [...]}} {{self-published source}}</ref></nowiki>{{self-published inline}} to flag both the reference and the inline citation.

I would like a maintenance tag bot to add, e.g., {{self-published inline}} after the {{sfn}}/{{harv}} instances matching footnoted citations that are flagged as self-published, deprecated or otherwise dubious. Guy (help!) 09:15, 25 February 2020 (UTC)

That's really too much of a WP:CONTEXTBOT here. For example, WordPress is a venue for a lot of self-published things, but that doesn't mean it's necessarily wrong to cite them (WP:RSCONTEXT), so applying {{self-published source}} to some of those would be flagging a problem that isn't one.
A WP:CITEWATCH/WP:UPSD-like solution really is the best thing here. The CiteWatch only looks for |journal=, but a similar bot could be coded to look for domains found in |url= and |publisher/website/magazine/journal/work/...= Headbomb {t · c · p · b} 22:22, 25 February 2020 (UTC)
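A sketch of the CiteWatch-like domain scan described above; the flagged domains here are placeholders, not real watchlist entries, and a production version would read the list from a maintained page:

```python
import re
from urllib.parse import urlsplit

# Placeholder watchlist; a real bot would load flagged domains from
# a maintained on-wiki list, not hard-code them.
FLAGGED = {"example-vanitypress.com", "deprecated-site.example"}

def flagged_urls(wikitext):
    """Return |url= values in citation templates whose host is flagged."""
    urls = re.findall(r"\|\s*url\s*=\s*(\S+)", wikitext)
    return [u for u in urls if urlsplit(u).hostname in FLAGGED]
```

Scanning |publisher=, |website=, |journal= and friends would need similar patterns plus fuzzy matching, as CiteWatch does for journals.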

Address implicit/structural composition gender bias by bot[edit]

While it is likely impossible to automate all of these guidelines Wikipedia:Writing_about_women, things like using last name or relationships in lede are systematic bias which can have systematic solutions. A bot attempting to do this would be AMAZING (where exceptions like Icelandic folks would be an opt out rather than opt in) — Preceding unsigned comment added by Icy13 (talkcontribs) 21:39, 25 February 2020 (UTC)

Icy13, interesting idea, but I think it's way too much of a WP:CONTEXTBOT problem. The bot would need to be able to do the following:
  • Identify the article as a biography of a female - not necessarily an easy task for a bot! Gender may or may not be mentioned in categories, isn't in the infobox and just checking for words like "she" and "her" wouldn't be sufficient.
  • Identify the subject's given and family names - again, not necessarily an easy task, given the number of name formatting styles. I don't just speak of patronymic names like Icelandic, but surname-first family names, Spanish names which contain surnames from both parents, mononymous people, probably several more cases I haven't thought of.
  • Recognize problematic sentences like those you suggested.
In short, I think it's much too context-sensitive to be feasible for a bot to reliably identify, much less correct, the rules you linked. creffpublic a creffett franchise (talk to the boss) 21:57, 25 February 2020 (UTC)
If there's such a bot coded, it would probably have to be the kind that creates a report on a centralized page, rather than one that edits anything in the article or its talk page. Headbomb {t · c · p · b} 22:17, 25 February 2020 (UTC)

A bot that categorizes (possibly tags) pages with embedded images and files that don't have alt text.[edit]

Basically the title. There are numerous articles with images (and other content) that should have alt text but do not. MOS:ALT says that we should try to ensure that images have alt text for accessibility reasons, which is especially important for people using screen readers who cannot physically see the images. In a nutshell, said bot would check articles that have an embedded file such as a video, audio clip, or image. It would then add a maintenance category to the article depending on whether or not the embed has alt text, as well as possibly a tag letting readers (including people using screen readers) know that alt text is missing.
A related idea would be the same as the above but for math markup, which should probably be tagged/categorized separately due to the technical knowledge required to translate it into English. Chess (talk) Ping when replying 01:52, 5 March 2020 (UTC)
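The detection step could look something like the sketch below, operating on raw wikitext. The regex is naive (nested links inside captions defeat it) and a production bot would want a proper wikitext parser:

```python
import re

# Matches a [[File:...]] or [[Image:...]] link, including its options.
# Nested [[...]] inside captions will break this pattern; sketch only.
FILE_LINK = re.compile(r"\[\[(?:File|Image):[^\]|]+(?:\|[^\]]*)?\]\]")

def files_missing_alt(wikitext):
    """List embedded file links that carry no |alt= option."""
    return [m.group(0) for m in FILE_LINK.finditer(wikitext)
            if "alt=" not in m.group(0)]
```

Per the concerns above, a report page built from this list may be more useful than in-article tagging, which invites caption-copying.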

I've seen past discussions about ALT text and one thing that came up is that it's not always appropriate to have ALT text for an image and that ALT text is often hard to write. I am thus not sure we want to have a general maintenance tag for missing ALT text. Jo-Jo Eumerus (talk) 08:37, 5 March 2020 (UTC)
I agree with Jo-Jo about how hard writing one seems to be. I have seen everything from a repeat of the caption to a detailed description of the picture that mentions everything except who or what is in the image. If it does proceed someone will want to write a guideline page giving detailed instructions about how to write one. MarnetteD|Talk 08:42, 5 March 2020 (UTC)
I should add that it is only used occasionally so the number of articles has gotta be immense. MarnetteD|Talk 08:44, 5 March 2020 (UTC)
Yes, if images are tagged as needing alt text, there is the likelihood that somebody will simply copy the caption to |alt= and feel that they are justified in removing the tag. No |alt= parameter is better than having a repeat of the caption. --Redrose64 🌹 (talk) 14:52, 5 March 2020 (UTC)
Is it possible to at least get alt text tagging for math markup? As of now many of these equations are inaccessible. Chess (talk) Ping when replying 16:59, 5 March 2020 (UTC)
A bot and tag does not seem suitable to this problem. As it is today, the image alt tag is I believe filled by the LaTeX, which is a reasonable alt text. I would recommend improvements to MediaWiki core and the Math extension to emit a tracking category or Linter error instead. --Izno (talk) 00:23, 7 March 2020 (UTC)
@Izno: The LaTeX can be questionable or even incomprehensible in many cases especially to someone not intimately familiar with the markup. For example, \and and \or are deprecated in favour of \land and \lor, which obviously can cause problems with screenreaders. Help:Latex has a lot of examples and if you look at some of the LaTeX source for them you can see how it might be incomprehensible for a screen reader. Formatting instructions would presumably also be a pain to hear especially if there's a lot of them.
Anyways one of the main reasons for me requesting this is that I'd like to start adding some math markup alt-text myself. If it's not possible to get a bot to categorize, is there another potential way I could find a list of LaTeX equations in articles? I'm not good with coding so I'd love it if there was a way possibly with Regex or something. And if I were to do this is there some place I'd need to seek consensus before doing so? Also is there anywhere I could get a good opinion on how transcriptions of math should work? Chess (talk) Ping when replying 00:29, 9 March 2020 (UTC)
@Chess: I said reasonable, not any other word that would indicate that all was right in the world. :) I have no doubt the LaTeX can be incomprehensible at times.
This won't tell you which ones have alt text and which don't, but this search is as good as any. I suspect most of those pages need them, so edit to your heart's content. If you think you might need consensus, you should ask at WT:MATH (your question is reasonable but I don't know better than you do if others will be disappointed by your changes).
As I said, I think it would be a good idea to change one of a couple of extensions to emit categories or Linter errors instead, if alt text is desirable generally. Help:Bug reports is the place to start for that. --Izno (talk) 00:40, 9 March 2020 (UTC)
Thinking about it some more, alt text for math that would avoid ambiguity would require us to standardize some method of converting mathematical expressions to words in an unambiguous and consistent way. Such a formal language would be an unprecedented undertaking that would probably require significant expertise far beyond what I have. Especially considering the reason why math has switched to symbols for expression is that wordy statements can be impossible to parse. Chess (talk) Ping when replying 22:51, 10 March 2020 (UTC)

Converting inline interlanguage links to use Template:Interlanguage link?[edit]

It's not uncommon that inexperienced editors will add piped inline interlanguage links to articles that exist on a different Wikipedia in order to avoid red links. This is a contravention of the MOS, as it surprises the reader, and prevents links to valid articles once they are created. Such piped links should be replaced with the {{Interlanguage link}} template, e.g. Special:Diff/866719019.

Is this something that could feasibly be done by a bot? Are there valid intentional uses that shouldn't be changed? (I guess it's clearer with languages that use non-Latin script, since I can't think of a good reason to pipe a foreign name under English text, but I'm not sure about those which use the Latin alphabet.) --Paul_012 (talk) 02:48, 13 March 2020 (UTC)

There's an idea to explore here, but there should be a fair amount of testing on this. I feel oftentimes those interwiki links should be replaced with an enwiki link (as in we have an article, but someone used another language for some reason). Also, probably should only affect mainspace/draft space. Headbomb {t · c · p · b} 03:55, 13 March 2020 (UTC)
Definitely something to explore, though I wonder about the scope; is this a "hundred-edit cleanup" project, or is this a "hundred-edit-per-day cleanup" project? I notice that WP:WCW tracks (via #45, #51, #53, and #91) most of this type of issue. Only about 400 hits on the first three. The only one I would be concerned about is #91, because I glanced at a few and it looks like people are trying to use the other-language wikis as references; converting those to {{ill}} might be problematic as they would be harder to track and thus makes it a CONTEXT issue. Primefac (talk) 15:26, 15 March 2020 (UTC)
I did some preliminary searches and found some three thousand results for Japanese and Chinese, and a hundred or so for Thai. It does appear that there are false positives in the form of deliberate references, though these should be avoidable by excluding citations. I just realised though that there's another issue which may prevent a bot from doing this: the link might be piped to something else that isn't the appropriate English title, e.g. abbreviations and non-disambiguated forms. For example, 2020 in Philippine television contains the piped link [[:th:ลมซ่อนรัก (ละครโทรทัศน์)|Hidden Love]], which would need to be converted to {{ill|Hidden Love (TV series)|lt=Hidden Love|th|ลมซ่อนรัก (ละครโทรทัศน์)}}. --Paul_012 (talk) 18:58, 16 March 2020 (UTC)
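For reference, the mechanical part of the conversion is a one-line substitution. The sketch below guesses that the pipe label doubles as the future English title, which, as the Hidden Love example shows, is often wrong, so output would need human review rather than unattended bot edits:

```python
import re

# [[:lang:Foreign title|Label]] -> {{ill|Label|lang|Foreign title}}
# Assumes the label is the intended English title - often untrue.
ILL = re.compile(r"\[\[:([a-z-]+):([^\]|]+)\|([^\]]+)\]\]")

def to_ill(wikitext):
    return ILL.sub(r"{{ill|\3|\1|\2}}", wikitext)
```

Links inside <ref> tags would have to be excluded first, per the reference-usage false positives noted above.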

Follow up task for files tagged Shadows Commons by GreenC bot job 10[edit]

GreenC bot by @GreenC: has a job that detects when a file on Wikipedia has the same name as one on Commons but is a different image, and tags the local file with Template:Shadows Commons, which puts it in Category:Wikipedia files that shadow a file on Wikimedia Commons.

I've been processing the files in that category, and many of the files on Commons are copyright violations, which are deleted within hours/days of upload. It would be useful for a bot to review the files tagged with Template:Shadows Commons and remove that template if there is no longer a file on Commons with the same name.

At any given time there are only a small number of files in that category, 30 or so, so this could potentially be done more than once a day without being very resource intensive, though once a day would be plenty useful. The Squirrel Conspiracy (talk) 06:43, 21 March 2020 (UTC)
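The wikitext side of this task is simple enough to sketch. The function below (a hypothetical helper, not an existing bot) strips the {{Shadows Commons}} tag from a file description page; a real bot would first confirm via the API that no file of the same name exists on Commons (e.g. with pywikibot's FilePage.exists() against the Commons site) before untagging.

```python
import re

# Matches {{Shadows Commons}} with optional parameters, plus a trailing
# newline so removal doesn't leave a blank line behind. Template redirects
# are not handled here (an assumption of this sketch).
SHADOW_TAG = re.compile(r"\{\{\s*[Ss]hadows Commons[^}]*\}\}\n?")

def untag(wikitext: str) -> str:
    """Remove the Shadows Commons tag once the Commons name is free."""
    return SHADOW_TAG.sub("", wikitext)
```

Given the ~30 files in the category at any time, running this check once daily, as suggested, would be a very light load.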

Shadowbot, the bot that adds the tags, runs daily at 4:37 GMT. -- GreenC 11:38, 21 March 2020 (UTC)

Updating essay impact assessments[edit]

Per this conversation, the automated essay assessment system has fallen badly out of date since BernsteinBot stopped updating it in 2012. It would be useful to revive it so that essay readers could have a better indication as to whether the essay they are reading is more likely to represent a widespread norm or just a minority viewpoint. MZMcBride has provided the original code, but it will need to be updated. Your help would be much appreciated. Regards, Sdkb (talk) 20:19, 22 March 2020 (UTC)

Convert Non-free reduce to Non-free manual reduce for images that are not .jpg or .png[edit]

Greetings. I'm here once again to bother you all about Files!

Tagging a file {{Non-free reduce}} places it in Category:Wikipedia non-free file size reduction requests, where User:DatBot performs the file size reduction automatically if the file is in .png or .jpg format. However, DatBot doesn't process any other format, and therefore files in other formats need manual processing.

I am requesting a bot to, once daily, check all files in Category:Wikipedia non-free file size reduction requests and, if the file format is not .png or .jpg, change {{Non-free reduce}} into {{Non-free manual reduce}}, so that they're more readily processed.
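The per-file decision described above could be sketched as below. The extension list is an assumption based on the request (DatBot handling only .png and .jpg); everything else gets retagged for manual processing.

```python
import re

# Formats DatBot reduces automatically, per the request above (assumption:
# .jpeg is treated the same as .jpg).
AUTO_FORMATS = {".png", ".jpg", ".jpeg"}

def retag(filename: str, wikitext: str) -> str:
    """Swap {{Non-free reduce}} for {{Non-free manual reduce}} on files
    whose format DatBot cannot shrink automatically."""
    parts = filename.lower().rsplit(".", 1)
    if len(parts) == 2 and "." + parts[1] in AUTO_FORMATS:
        return wikitext  # leave automatic cases for DatBot
    return re.sub(r"\{\{\s*[Nn]on-free reduce", "{{Non-free manual reduce", wikitext)
```

As it turned out (see the reply below this request), the same effect was achieved without a bot by changing the template's category sorting, which is the simpler design.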

Thanks! The Squirrel Conspiracy (talk) 02:31, 23 March 2020 (UTC)

No need for a bot. I've updated the category sorting in {{non-free reduce}} to default to manual except for png, jpg, and svg. — JJMC89(T·C) 02:59, 23 March 2020 (UTC)
You are a dark and powerful warlock and I fear to fall afoul of your mastery of the code. Excellent. Looks good to me. Thanks! The Squirrel Conspiracy (talk) 04:21, 23 March 2020 (UTC)

Done

Wikipedia Edition Article Similarity Bot[edit]

I have a working bot; its purpose is to inform readers and editors about the presence of different content in other Wikipedia language editions of the article they are reading. This information can guide the reader to content that will add to their study, and/or highlight that content in another language may be biased. I hope such a bot would ratchet up the level of discourse across language editions and spread useful knowledge between them.

My proposal is that the bot be allowed to add a small phrase to the 'See Also' section of a given article, such as, "The Russian edition of this article is 70% different from this edition. You can view it here."

As I was working on this bot, there was an ongoing discussion at the Idea Lab. You can view it at Wikipedia Edition Article Similarity Bot.

I assert that the bot works: its most limiting factor right now is that I only have access to 2 million characters of translation capability per month for article comparisons, which limits the bot's output to a relative handful of articles per month. You can see the code here.
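For readers wondering what a "70% different" figure might mean in practice: one possible measure (not necessarily what Theory42's bot computes; the linked code is authoritative) is the complement of a sequence-similarity ratio over the machine-translated texts, e.g. using Python's standard difflib.

```python
import difflib

def percent_different(a: str, b: str) -> float:
    """Percentage difference between two texts, as one minus difflib's
    similarity ratio. Purely illustrative of the kind of metric involved."""
    return round((1 - difflib.SequenceMatcher(None, a, b).ratio()) * 100, 1)
```

Identical texts score 0.0; entirely disjoint texts approach 100.0. The translation-quota bottleneck mentioned above comes from having to translate both articles into a common language before any such comparison.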

This is not a bot request; it is a request for the bot to have edit capabilities. If there is a more appropriate place for this request, please let me know.

Theory42 (talk) 16:16, 26 March 2020 (UTC)

I think you need WP:BRFA, though one of the requirements is that you show you have consensus to perform the edits you want. I don't see that in the Village Pump discussion you've linked to. Spike 'em (talk) 16:56, 26 March 2020 (UTC)

Articles needing an infobox backlog reducer[edit]

There are many articles in the backlog (or at least enough that it would be tedious to check each one manually) that actually do have infoboxes. I think a bot could take a category of 200-500 pages, read the wikitext of each one, and, if the page contains {{infobox, go to the article's talk page and remove the needs-infobox=yes parameter. Firestarforever (talk) 13:39, 28 March 2020 (UTC)
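The two checks in the request above can be sketched as pure functions on wikitext (hypothetical helper names; a real bot would also have to handle infoboxes transcluded through wrapper templates whose names don't start with "Infobox", which this crude substring check misses):

```python
import re

def has_infobox(article_text: str) -> bool:
    """Crude check for an infobox transclusion in the article wikitext."""
    return bool(re.search(r"\{\{\s*[Ii]nfobox", article_text))

def drop_needs_infobox(talk_text: str) -> str:
    """Remove a needs-infobox=yes parameter from a WikiProject banner
    on the talk page."""
    return re.sub(r"\|\s*needs-infobox\s*=\s*yes\s*", "", talk_text)
```

A bot run would call `has_infobox` on each article in the submitted category and apply `drop_needs_infobox` to the talk page only when it returns True.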

I am pretty sure there was a recent bot request and even BRFA to do that. --Izno (talk) 17:55, 28 March 2020 (UTC)
Wait, really? Huh. I guess I just needed to look around some more. Thanks. Firestarforever (talk) 19:36, 28 March 2020 (UTC)
Found one: Wikipedia:Bots/Requests_for_approval/PearBOT_2. My request can be safely ignored. Firestarforever (talk) 11:57, 30 March 2020 (UTC)