Scripted Re-Mark - Help Page

On this page, you'll find help for the batch-mode del.icio.us bookmark editor, Scripted Re-Mark. You can see some simple examples, get a handle on the Tag Tidy function and "touched tags", understand regular expressions, perform a mass delete or get some tips and tricks.

Simple Examples

At its simplest, you can use Scripted Re-Mark to perform a "find and replace" on your bookmarks, much as you would in a text editor or word processor. This can be applied to any field (title/description, URL, notes/extended, tags). For example, you could correct the spelling of "labor" in your titles using this rule:

Field | Search | Replace | Case | Match | Mode

Note that this rule will only apply to the title field - it will leave the notes, URLs and tags alone. Also, it will apply to Labor, lAboR and laboR, since we're ignoring case. Lastly, it will apply to all instances of "labor" in the title. If we wanted to just change the first, we'd set Match to "First".
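In JavaScript terms, a rule like this might come down to a single replace call. The sketch below is only an illustration: the replacement "labour", the sample title and the mapping of the Case and Match settings onto the i and g flags are assumptions, not taken from the tool's own code.

```javascript
// Sample title (made up for illustration).
const title = "Labor unions and laBor law";

// Case = ignore, Match = All: assumed to map to the "i" and "g" flags.
const allMatches = title.replace(/labor/gi, "labour");

// Case = ignore, Match = First: drop the "g" flag so only the first hit changes.
const firstOnly = title.replace(/labor/i, "labour");

console.log(allMatches); // labour unions and labour law
console.log(firstOnly);  // labour unions and laBor law
```

Note that the replacement text goes in exactly as typed, so "Labor" becomes "labour" rather than "Labour".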

Next, what about deleting unwanted text? That's easy; just replace it with the empty string. Think a phrase is over-used? You may wish to remove all references to "web 2.0" in the titles:
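As a rough JavaScript sketch of such a deletion (the sample title is made up, and the second tidy-up step is an extra suggestion rather than something the tool is known to do):

```javascript
// Sample title (made up for illustration).
const title = "Ten web 2.0 tools for Web 2.0 startups";

// The period is escaped so it matches a literal "." rather than any character.
const stripped = title.replace(/web 2\.0/gi, "");

// Removing text can leave doubled spaces behind; a second rule can collapse them.
const tidied = stripped.replace(/ {2,}/g, " ").trim();

console.log(stripped); // Ten  tools for  startups
console.log(tidied);   // Ten tools for startups
```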

You can apply these rules to your URLs as well. Suppose your favourite blog moves to a new domain. Here's a rule to update the bookmarks:
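A domain move could look something like this in JavaScript. Both domain names are hypothetical placeholders, so substitute your own:

```javascript
// Hypothetical old and new domains, for illustration only.
const url = "http://oldblog.example.com/2006/05/moving-day.html";

// ^ anchors the match to the start of the URL; the dots and slashes are escaped.
const moved = url.replace(
  /^http:\/\/oldblog\.example\.com\//i,
  "http://newblog.example.org/"
);

console.log(moved); // http://newblog.example.org/2006/05/moving-day.html
```

Anchoring with ^ means the old domain is only replaced where it starts the URL, not where it happens to appear later on (say, inside a query string).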

You can also use this tool to delete, split, merge or rename tags across selected bookmarks. If you wanted to split your tag "funny-story" into "funny" and "story", you could do this:
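Here's one way the split might be written in JavaScript, assuming the tags field is held as a single space-separated string. The sample tags and the whole-tag guards are additions for illustration:

```javascript
// Sample tags field (made up), assumed to be space-separated.
const tags = "humour funny-story blog";

// (^| ) and (?= |$) restrict the match to a whole tag,
// so a tag such as "not-funny-story" would be left alone.
const split = tags.replace(/(^| )funny-story(?= |$)/g, "$1funny story");

console.log(split); // humour funny story blog
```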

Pretty easy, huh? There's no end to the possibilities - especially when you unleash the power of regular expressions ...

Tag Stemming

The Tag Tidy button will apply stemming to your tags. Loosely speaking, this means grouping similar tags, where they only differ in their ending. For example, the tags fish, fishing, fisher and fished all share the same word stem and will be grouped. Scripted Re-Mark uses Porter stemming.

When merging lexically similar tags, you can choose whether to replace the words with the most frequent tag or the shortest tag. Suppose you have tagged 8 URLs with fish, 15 URLs with fishing and 2 with fished. Using "frequency", all references to fish and fished will be replaced with fishing, since it's the tag that occurs most frequently (15 times). Using the "brevity" option, fish would replace all others, since it's the shortest (four letters).
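The two options can be sketched in a few lines of JavaScript. This is not the tool's actual code: it assumes the tags have already been grouped by stem, and it breaks ties by keeping whichever tag comes first.

```javascript
// Tag counts from the example above; the group is assumed to share one stem.
const counts = { fish: 8, fishing: 15, fished: 2 };
const group = Object.keys(counts);

// "frequency": keep the tag used on the most bookmarks.
const byFrequency = group.reduce((best, tag) =>
  counts[tag] > counts[best] ? tag : best
);

// "brevity": keep the tag with the fewest letters.
const byBrevity = group.reduce((best, tag) =>
  tag.length < best.length ? tag : best
);

console.log(byFrequency); // fishing
console.log(byBrevity);   // fish
```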

NB: You can also use these generated rules to manually merge tags on the delicious tag edit/rename page. Thanks to Doug Pfeffer for encouragement, and of course, Martin Porter for his stemming code!

Regular Expressions

The rules that Scripted Re-Mark builds and applies to your bookmarks are, in fact, regular expressions. These are used by programmers to manipulate text according to patterns. They are a very flexible and powerful way of doing so, and are common to a number of languages. (We're using JavaScript's RegExp object here.) You can read up on the history and theory of regular expressions, but we recommend a more hands-on introduction. You can play with an interactive regexp tester. There's also a comprehensive how-to and a quick reference, if you just need to brush up on the syntax.

Maybe some more advanced examples will help. We can use regular expressions to match more complex cases. For example, suppose we wanted to replace either "practice" or "practise" with "praxis". Sure, we could make two rules - or we could do this:
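In JavaScript, a single rule covering both spellings might look like this (the sample title and the choice to ignore case are assumptions):

```javascript
// Sample title (made up for illustration).
const title = "Practice notes: how to practise scales";

// [cs] accepts either spelling; "g" replaces every match, "i" ignores case.
const result = title.replace(/practi[cs]e/gi, "praxis");

console.log(result); // praxis notes: how to praxis scales
```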

The [cs] says "match either c or s". We can also use special characters that stand for whole groups: \d means "any digit", \w means "any word character" (a letter, digit or underscore), \s means "any whitespace character" and . (period) means, well, anything (except a line break). ^ means "start of the string" and $ means "end of the string". Further, we can set it to match multiple occurrences using the + (one or more) and * (zero or more) operators. For example, \w+\d* says "match against one or more word characters followed by zero or more digits". Suppose we wanted to remove any 9-digit serial numbers in the notes field, replacing each with xxxxxxxxx:
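A sketch of the serial-number rule in JavaScript follows. The sample notes are made up, and the stricter variant with \b is an extra suggestion, not part of the original rule:

```javascript
// Sample notes field (made up for illustration).
const notes = "Serial 123456789, order 42";

// \d{9} matches exactly nine digits in a row; "g" masks every occurrence.
const masked = notes.replace(/\d{9}/g, "xxxxxxxxx");
console.log(masked); // Serial xxxxxxxxx, order 42

// Stricter variant: \b (word boundary) on each side stops a longer number
// from having just its first nine digits masked.
const longer = "ref 1234567890";
const strict = longer.replace(/\b\d{9}\b/g, "xxxxxxxxx");
console.log(strict); // ref 1234567890 (unchanged)
```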

Note that the {9} is a specific quantifier; it looks for exactly nine occurrences of whatever preceded it (in this case, a digit). OK, one last trick: capturing parentheses. You can wrap ( ... ) around parts of your regular expression and the text it matches can be used again in the replace string! The first time you do this, the result is stored as $1. Subsequent captures are $2, $3 and so on. Here's how you might reverse the names of authors in the notes field (eg from "John Smith" to "Name: Smith, John"):

The regular expression says "match one or more word characters, followed by a space, followed by one or more word characters". The replace string says "replace with Name: followed by a space, the result of the second capture, a comma and space and the result of the first capture".
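Following that description, the rule could be written in JavaScript like so (a sketch based on the wording above, not copied from the tool):

```javascript
// Notes field containing just an author's name, as in the example.
const notes = "John Smith";

// (\w+) captures the first name as $1 and the surname as $2.
const reversed = notes.replace(/(\w+) (\w+)/, "Name: $2, $1");

console.log(reversed); // Name: Smith, John
```

Without the g flag only the first pair of words is swapped, which is what you want if the notes hold more than a name.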

Just a couple of notes on using regexps ... if you want to use punctuation, you'll probably need to preface it with a backslash. Eg to match a period or a dollar sign or a backslash, use \. and \$ and \\. Here's one to swap the year and month directories (yyyy/mm becomes mm/yyyy):
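A possible JavaScript version of the year/month swap, with a second line showing escaped punctuation. The URL and the price text are made-up samples:

```javascript
// Hypothetical blog URL with yyyy/mm directories.
const url = "http://blog.example.com/2006/05/post.html";

// Four digits, an escaped slash, two digits; the captures are swapped on output.
const swapped = url.replace(/(\d{4})\/(\d{2})/, "$2/$1");
console.log(swapped); // http://blog.example.com/05/2006/post.html

// Escaping punctuation: \$ and \. match a literal dollar sign and period.
const priced = "Price: $5.00".replace(/\$5\.00/, "five dollars");
console.log(priced); // Price: five dollars
```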

As you can see, you are only limited by your imagination and ingenuity when it comes to transforming your bookmarks with regular expressions.

Got more examples using regular expressions? We'd love to hear from you!

Using Functions

Due to popular demand, Scripted Re-Mark now supports passing functions as parameters! What does this mean? Well, perhaps a quick example will suffice. Suppose you want to convert all your titles to upper case. Sure, you could use 26 rules that map a -> A, b -> B and so on. Or, you could use JavaScript's built-in toUpperCase() function:

There are two things going on here: first, we've changed the mode from "String" to "Function". This lets the script know that you're not talking about a string called "function". (Hey, chances are someone's tagged a URL with that word.) Secondly, the Replace parameter is now JavaScript code; specifically, an anonymous function. JavaScript will run this code on any matches it finds, in this case turning lowercase word-letters into uppercase ones. You can use any built-in functions you like, or even define your own. If you're going to push the envelope, it's worth reading some more on using functions in regular expressions.
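In plain JavaScript, passing a function as the replacement looks roughly like this. The search pattern [a-z] and the sample title are assumptions; the tool's own rule may differ:

```javascript
// Sample title (made up for illustration).
const title = "my bookmarks, 2006 edition";

// The anonymous function is called once per match and returns the replacement.
const shouted = title.replace(/[a-z]/g, function (match) {
  return match.toUpperCase();
});

console.log(shouted); // MY BOOKMARKS, 2006 EDITION
```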

Here's another contrived example. Suppose you wanted to prepend your notes with a count of the number of characters. Try this:
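One way to sketch this in JavaScript is shown below. The pattern and the "N chars: " prefix format are made up for illustration:

```javascript
// Sample notes field (made up for illustration).
const notes = "hello world";

// ^[\s\S]*$ matches the whole field, line breaks included;
// the function receives that text and prepends its length.
const counted = notes.replace(/^[\s\S]*$/, function (all) {
  return all.length + " chars: " + all;
});

console.log(counted); // 11 chars: hello world
```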

As you can imagine, with functions you're only limited by your imagination. If you come up with a doozy, please share it for others to use on Freshblog!

Touched Tag

This is handy for multi-stage processing (like filter/delete) and reviewing automatic edits. It attaches a designated tag to all (and only those) bookmarks that have been modified by or match your rules. For example, you could add the tag "google_query" to all bookmarked queries on Google by setting the "touched" value to google_query and applying this rule:
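The idea can be sketched in JavaScript as follows. The function name applyTouchedTag, the bookmark objects and the URL pattern are all hypothetical and only illustrate "tag whatever matches"; they are not the tool's internals:

```javascript
// Hypothetical helper: add touchedTag to a bookmark whose URL matches the pattern.
function applyTouchedTag(bookmark, pattern, touchedTag) {
  if (!pattern.test(bookmark.url)) {
    return bookmark; // no match, so the bookmark is left untouched
  }
  const tags = bookmark.tags.includes(touchedTag)
    ? bookmark.tags
    : bookmark.tags.concat(touchedTag);
  return { url: bookmark.url, tags: tags };
}

// Assumed pattern for a Google query URL (no "g" flag, so test() keeps no state).
const googleQuery = /google\.com\/search/i;

const query = { url: "http://www.google.com/search?q=regex", tags: ["search"] };
const other = { url: "http://example.com/", tags: ["misc"] };

console.log(applyTouchedTag(query, googleQuery, "google_query").tags); // [ 'search', 'google_query' ]
console.log(applyTouchedTag(other, googleQuery, "google_query").tags); // [ 'misc' ]
```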

Note that this leaves the URLs as is, but since there's a match it adds the touched tag. Suppose you further wanted to prefix the title of all bookmarks with (Query). Follow the steps above to apply google_query as your touched tag. Apply the rules to your bookmarks in the usual fashion. Then, return to the Scripted Re-Mark page and clear all the rules. Next, create a new rule:
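A prefixing rule of this kind might look like the following in JavaScript (the sample title is made up):

```javascript
// Sample title of a bookmarked Google query (made up for illustration).
const title = "regex - Google Search";

// ^ matches the empty position at the start, so the replacement is simply inserted there.
const prefixed = title.replace(/^/, "(Query) ");

console.log(prefixed); // (Query) regex - Google Search
```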

When you apply this rule to your bookmarks, make sure you select the google_query tag on your delicious page. This will mean only those bookmarks that were marked as a query will have this new rule applied.

Deleting Bookmarks

This powerful - and dangerous - feature allows you to mass delete bookmarks from del.icio.us. Please be warned that they are gone for ever. You can delete bookmarks by tag, URL, title, description, or sharing setting. The deletion code will remove all bookmarks visible on your del.icio.us page, so you must first filter the bookmarks by selecting a tag. To do this, open a new window with the requisite URL eg http://del.icio.us/joshua/nyc?setcount=50 would select Joshua's most-recent 50 bookmarks tagged with "nyc". Proceed with caution to use the deletion code found at the bottom of Scripted Re-Mark.

If your unwanted bookmarks don't share a common tag, you must first use the "touched tag" feature to give them one. Suppose you wish to delete all your MySpace bookmarks. Simply create a rule to match them all, ensure the "touched tag" option is set (checked) and provide a tag name like "to-delete":

(Note that we left the replace string blank, removing the URL. But hey - it's about to get deleted, so what do we care?) Go ahead and apply this rule in the usual fashion. All your MySpace bookmarks will now be tagged with "to-delete". Simply open your del.icio.us page with the posts marked for destruction eg http://del.icio.us/joshua/to-delete?setcount=100 and use the deletion code.

If you want to delete all your public (or private) bookmarks, you can get the "to-delete" tag affixed as follows. Leave the search/replace rules blank. Set Bookmark Sharing to "set private" ("set public"). Make sure "touched tag" is selected (checked) and use a tag name like "to-delete". Apply the rule in the usual fashion. All your public (private) bookmarks will now be private (public), but with the additional tag of to-delete. These can be removed as described above.

Please note that there is great potential for harm when using the delete code. You can't get those deleted bookmarks back. Neither can I, and I doubt del.icio.us can either. Please back up your bookmarks first, ensure you understand what's going on and try it out on a small set (eg 10 posts) first.

Tips and Tricks

This section helps you get the most out of Scripted Re-Mark. The first tip relates to some precautions you can take. The second is how to speed up the process.

Safety First

Like any powerful tool, this can be harmful for the operator. It must be said: this service has the potential to seriously bugger up your bookmarks. Before attempting to apply your ruleset, we recommend the following steps:

  1. Test Your Rules. Use the Test Rules facility to ensure that the rules have the intended effect. Scroll through a number of your bookmarks, checking that you're happy with the results on all of them.
  2. Back Up Your Bookmarks. Backing up your bookmarks takes just a few seconds and will potentially avoid disappointment later.
  3. Pilot on a Sample. Rather than launching straight in and possibly mangling a hundred bookmarks at once, set the del.icio.us page to show only ten bookmarks. Apply the rules to them and, if you're happy, proceed with a bigger count.

Hassle-Free Edits

If you are applying update rules to hundreds of bookmarks, it will take quite some time. There are two limiting factors from del.icio.us to contend with: the first is that you can only see 100 bookmarks per page. The second is that this service must introduce a delay of a few seconds to stop del.icio.us throttling you back (sending requests - even manually - too fast will see you locked out for ten minutes or more). Suppose you have 580 bookmarks to update. With a three second delay between each update, that's going to take half an hour! Here's what you can do:

Open six windows with a hundred bookmarks on each (1-100, 101-200, ..., 501-580). Once you're happy with your rules (and have followed the precautions above), set the delay parameter to 18 seconds (that's six times longer than usual). Then copy and paste the code into each of the six windows with a three second delay between each. It will still take half an hour (with an update every three seconds), but you won't have to wait for each page to finish and set up the next one. Parallel processing means your attention isn't required, so you can just leave it overnight or get on with your surfing.
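The arithmetic behind those numbers can be checked with a few lines of JavaScript. The variable names are made up; the figures come from the example above:

```javascript
const totalBookmarks = 580;
const perPage = 100;        // del.icio.us shows at most 100 bookmarks per page
const baseDelaySeconds = 3; // usual delay between updates

// One window at a time: every bookmark waits its turn.
const sequentialMinutes = (totalBookmarks * baseDelaySeconds) / 60;

// Six windows at once: each window waits six times longer between its own updates,
// so overall there is still one update roughly every three seconds.
const windowCount = Math.ceil(totalBookmarks / perPage);
const staggeredDelaySeconds = baseDelaySeconds * windowCount;
const parallelMinutes = (perPage * staggeredDelaySeconds) / 60;

console.log(windowCount);           // 6
console.log(staggeredDelaySeconds); // 18
console.log(sequentialMinutes);     // 29
console.log(parallelMinutes);       // 30
```

The total time is about the same either way; the gain is that nobody has to sit there setting up the next page.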

Got more tips or tricks? Please send 'em in!


Comments, feedback, troubleshooting, suggestions and discussions to Freshblog, please.


This work is licensed under a Creative Commons Attribution-ShareAlike 2.1 Australia License.
