Love.Law.Robots. by Ang Hou Fu

LegalTech


I’ve wanted to pen down my thoughts on the next stage of the evolution of my projects for some time. Here I go!

What’s next after pdpc-decisions?

I had a lot of fun writing pdpc-decisions. It scraped data from the Personal Data Protection Commission’s enforcement decisions web page and produced a table, downloads and text. Now I got my copy of the database! From there, I made visualisations, analyses and fun graphs.

All for free.

The “free” includes the training I got coding in Python and trying out various stages of software development, from writing tests to distributing a package as a module and a docker container.

In the lofty “what’s next” section of the post, I wrote:

The ultimate goal of this code, however leads to my slow-going super-project, which I called zeeker. It’s a database of personal data protection resources in the cloud, and I hope to expand on the source material here to create an even richer database. So this will not be my last post on this topic.

I also believe that this is a code framework which can be used to scrape other types of legal cases like the Supreme Court, the State Court, or even the Strata Titles Board. However, given my interest in using enforcement decisions as a dataset, I started with PDPC first. Nevertheless, someone might find it helpful so if there is an interest, please let me know!

What has happened since then?

For one, personal data protection commission decisions are not interesting enough for me. Since working on that project, the deluge of decisions has slowed to a trickle as the PDPC appears to have changed its focus to compliance and other cool techy projects.

Furthermore, there is much more interesting data out there: for example, the PDPC has created many valuable guidelines which are currently unsearchable. As Singapore’s rules and regulations grow in complexity, there’s much hidden beneath the surface. The zeeker project shouldn’t just focus on a narrow area of law or judgements and decisions.

In terms of system architecture, I made two other decisions.

Use more open-source libraries, and code less.

I grew more confident in my coding skills doing pdpc-decisions, but I used a few basic libraries and hacked my way through the data. When I look back at my code, it is unmaintainable. Any change can break the library, and the bog of whacked-up coding made it hard for me to understand what I was doing several months later. Tests, comments and other documentation help, but only if you’re a disciplined person. I’m not that kind of guy.

Besides writing code (which takes time and lots of motivation), I could also “piggyback” on the efforts of others to create a better stack. The stack I’ve decided so far has made coding more pleasant.

There are also other programs I would like to try. For example, I plan to deliver the data through an API, so I don’t need to use Python to code the front end. A JavaScript framework like Next.js would be more effective for developing websites.

Decoupling the project from the programming language also expands the palette of tools I can have. For example, instead of using a low-level Python library like pdfminer to “plumb” a PDF, I could use a self-hosted docker container like parsr to OCR or analyse the PDF and then convert it to text.

It’s about finding the best tool for the job, not depending only on my (mediocre) programming skills to bring results.

There’s, of course, an issue of technical debt (if parsr is not being developed anymore, my project can slow down as well). I think this is not so bad because all the projects I picked are open-source. I would also pick well-documented and popular projects to reduce this risk.

It’s all a pipeline, baby.

The only way the above is possible is a paradigm shift from making one single package of code to thinking about the work as a process. There are discrete parts to a task, and the code is suited for that particular task.

I was inspired to change the way I thought about zeeker when I saw the flow chart for OpenLaw NZ’s Data Pipeline.

OpenLaw NZ’s data pipeline structure looks complicated, but it’s easy to follow for me!

It’s made of several AWS components and services (with some Azure). The steps are small, like receiving an event, sending it to a serverless function, putting the data in an S3 bucket, and then running another serverless function.

The key insight is to avoid building a monolith. I am not committed to building a single program or website. Instead, a project is broken into smaller parts. Each part is only intended to do a small task well. In this instance, zeekerscrapers is only a scraper. It looks at the webpage, takes the information already present on the web page, and saves or downloads the information. It doesn't bother with machine learning, displaying the results or any other complicated processing.

Besides using the right tool for the job, it is also easier to maintain.

The modularity also makes it simple to chop and change for different types of data. For example, you need to OCR a scanned PDF but don’t need to do that for a digital PDF. If the workflow is a pipeline, you can take that task out of the pipeline. Furthermore, some tasks, such as downloading a file, are standard fare. If you have code you can reuse over several pipelines, you can save much coding time.
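To make the idea concrete, here is a minimal sketch of a pipeline built from small, swappable steps. It is not zeeker's actual code: the step functions (`download`, `ocr`, `extract_text`) and the file names are hypothetical stand-ins that only build strings, so the example runs without any network or OCR service.

```python
from typing import Callable, Dict, List

# A "document" is just a dict that each step reads from and adds to.
Document = Dict[str, object]
Step = Callable[[Document], Document]


def download(doc: Document) -> Document:
    # Hypothetical stand-in: a real step would fetch doc["url"] over HTTP.
    return {**doc, "raw": f"<bytes of {doc['url']}>"}


def ocr(doc: Document) -> Document:
    # Hypothetical stand-in for an OCR service; only scanned PDFs need this.
    return {**doc, "text": f"OCR({doc['raw']})"}


def extract_text(doc: Document) -> Document:
    # Digital PDFs already carry text; if OCR already ran, leave its result alone.
    return doc if "text" in doc else {**doc, "text": f"TEXT({doc['raw']})"}


def run_pipeline(doc: Document, steps: List[Step]) -> Document:
    # Each step does one small job and hands its result to the next.
    for step in steps:
        doc = step(doc)
    return doc


scanned_pipeline = [download, ocr, extract_text]
digital_pipeline = [download, extract_text]  # same steps, minus OCR

scanned = run_pipeline({"url": "decision-1.pdf"}, scanned_pipeline)
digital = run_pipeline({"url": "decision-2.pdf"}, digital_pipeline)
```

Dropping OCR for digital PDFs is a one-line change to the list of steps, and `download` is reused by both pipelines without modification.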

On the other hand, I would be relying heavily on cloud infrastructure to accomplish this, which is by no means cheap or straightforward.

Experiments continue

Photo by Alex Kondratiev / Unsplash

I have been quite busy lately, so I have yet to develop this at the pace I would like. For now, I have been converting pdpc-decisions to zeeker. It’s been a breeze even though I took so much time.

On the other hand, my leisurely pace also allowed me to think about more significant issues, like what I can generalise and whether I will get bad vibes from this code in the future. Hopefully, the other scrapers can develop at breakneck speed once I complete thinking through the issues.

I have also felt more and more troubled by what to prioritise. Should I write more scrapers? Scrape what? Should I focus on adding more features to existing scrapers (like extracting entities and summarisation etc.)? When should I start writing the front end? When should I start advertising this project?

It’d be great to hear your comments. Meanwhile, keep watching this space!

#zeeker #Programming #PDPC-Decisions #Ideas #CloudComputing #LegalTech #OpenSource #scrapy #SQLModel #spaCy #WebScraping

Author Portrait Love.Law.Robots. – A blog by Ang Hou Fu


One can have a variety of opinions about the pandemic but I will insist on this one. It made everyone treat online not as a cute sideshow, but as an essential part of working life.

While stuck at home, I made it a point to attend any conference or talk online that seemed adjacent to my interests. I attended talks on machine learning and AI. I even learnt a bit of linguistics.

One of the more life-changing seminars I attended was the first Bucerius Legal Tech Essentials in 2020. In short, I highly recommend it for someone who doesn't have much time but needs to dive deep and swim wide in this field. They lived up to their tagline: “Curated. Intense. Remote.”

You swim wide because they cover a wide gamut of speakers, from academics and thought leaders to entrepreneurs with their own LegalTech companies.

You dive deep mainly because the speakers are talking about their expertise (this isn't a panel show). I recall that many speakers took questions, so you can engage with them.

The only bad thing was that since all the speakers were based on both sides of the Atlantic, the timing was horrendous for the other side of the world. I remember falling asleep at my desk, trying to figure out the Six Sigma rule around 1 in the morning.

Nevertheless, I didn't think I was the only person from South East Asia attending the talks. During the customary roll call of various attendees at the start of each session, you would get a taste of how global interest in LegalTech was.

People in Singapore would also get a taste of Bucerius Legal Tech Essentials when Prof Daniel Katz, one of the “hosts” of Legal Tech Essentials, gave a lecture in 2021 at SMU, Singapore. It was a whirlwind of 500 slides in 60 minutes. Note that there are no certifications or brownie points for attending or interacting. These people stayed up late for the LegalTech.

It seems that being in Singapore has borne other fruit. 2022's Legal Tech Essentials would feature timings more convenient for this part of the world. This means 8:30 pm here... which I reckon is a marked improvement over 1 am.

Legal Tech Essentials 2022: Curated, Intense, Remote. You can sign up for updates at their site.

So if you're interested in the field but don't know where to start, I strongly recommend this. I didn't enjoy it as much in 2021 since I found most topics less effective a second time. Maybe I will give this another try.

At the end of 2021, I repeatedly feared that online seminars would be buried and in-person conferences would be back in vogue. I'm glad that Legal Tech Essentials is back and still remote. It was a light in a very dark time of the pandemic, and now I hope it will still light a few light bulbs for anyone interested in Legal and Technology.

#Newsletter #LegalTech #Lawyers #News #tech #TechnologyLaw #Training #Presentation


If you’re interested in technology, you will confront this question at some point: should I learn to code?

For many people, including lawyers, coding is something you can ignore without serious consequences. I don’t understand how my microwave oven works, but that will not stop me from using it. Attending that briefing and asking the product guys good questions is probably enough for most lawyers to do their work.

The truth, though, is that life is so much more. In the foreword of the book “Law and Technology in Singapore”, Chief Justice Sundaresh Menon remarked that technology today “permeates, interfaces with, and underpins all aspects of the legal system, and indeed, of society”.

I felt that myself during the pandemic when I had to rely on my familiarity with technology to get work done. Coincidentally, I also implemented my docassemble project at work, using technology to generate contracts 24/7. I needed all my coding skills to whip up the program and provide the cloud infrastructure to run it without supervision. It’s fast, easy to use and avoids many problems associated with do-it-yourself templates. I got my promotion and respect at work.

If you’re convinced that you need to code, the rest of this post contains tips on juggling coding and lawyering. They are based on my personal experiences, so I am also interested in how you’ve done it and any questions you might have.

Tip 1: Have realistic ambitions

Photo by Lucas Clara / Unsplash

Lawyering takes time and experience to master. Passing the bar is the first baby step to a lifetime of learning. PQE is the currency of a lawyer in the job market.

Well, guess what? Coding is very similar too!

There are many options and possibilities — programming languages, tools and methods. Unlike a law school degree, there are free options you can check out, which would give you a good foundation. (Learnpython for Python and W3Schools for the web come to mind.) I got my first break with Udemy, and if you are a Singaporean, you can make use of SkillsFuture Credits to make your online learning free.

Just as becoming a good lawyer is no mean feat, becoming a good coder needs a substantial investment of time and learning. When you are already a lawyer, you may not have enough time in your life to be as good a coder.

I believe the answer is a strong no. Lawyers need to know what is possible, not how to do it. Lawyers will never be as good as real, full-time coders. Why give them another thing to thing the are “special” at. Lawyers need to learn to collaborate with those do code. https://t.co/3EsPbnikzK

— Patrick Lamb (@ElevateLamb) September 9, 2022

So, this is my suggestion: don’t aim to conquer programming languages or to produce, on your own, full-blown applications that rival a LegalTech company you’ve always admired. Focus instead on developing proofs of concept or pushing the tools you are already familiar with as far as they can go. In addition, look at no-code or low-code alternatives to get easy wins.

By limiting the scope of your ambitions, you’d be able to focus on learning the things you need to produce quick and impactful results. The reinforcement from such quick wins would improve your confidence in your coding skills and abilities.

There might be a day when your project has the makings of a killer app. When that time comes, I am sure you will decide that going solo is not only impossible but also a bad idea. Apps are pretty complex today, so I honestly think it’s unrealistic to rely on yourself to make them.

Tip 2: Follow what interests you

Muddy Hands. Photo by Sandie Clarke / Unsplash

It’s related to tip 1 — you’d probably be able to learn faster and more effectively if you are doing things related to what you are already doing. For lawyers, this means doing your job, but with code. A great example of this is docassemble, which is an open-source system for guided interviews and document assembly.

When you do docassemble, you would try to mimic what you do in practice. For example, you craft questions to get the information you need from a client to file a document or create a contract. However, instead of interviewing a person directly, you will be doing this in code.

In the course of my travails looking for projects which interest me, I found the following interesting:

  • Rules as Code: I found Blawx to be the most user-friendly way to get your hands dirty on the idea that legislation, codes and regulations can be code.
  • Docassemble: I mentioned this earlier in this tip
  • Natural Language Processing: Using Artificial Intelligence to process text will lead you to many of the most exciting fields these days: summarisation, search and question and answer. Many of these solutions are fascinating when used for legal text.

I wouldn’t suggest that law is the only subject that lawyers find interesting. I have also spent time trying to create an e-commerce website for my wife and getting a computer to play Monopoly Junior 5 million times a day.

Such “fun” projects might not have much relevance to your professional life, but I learned new things which could help me in the future. E-commerce websites are the life of the internet today, and I got to experiment with the latest cloud technologies. Running 5 million games in a day made me think harder about code performance and how to achieve more with a single computer.

Tip 3: Develop in the open

Waiting for the big show... Photo by Barry Weatherall / Unsplash

Not many people think about this, so please hang around.

When I was a kid, I had already dreamed of playing around with code and computers. In secondary school, a bunch of guys would race to make the best apps in the class (for some strange reason, they tended to revolve around computer games). I learned a lot about coding then.

As I grew up and my focus changed towards learning and building a career in law, my coding skills deteriorated rapidly. One of the obvious reasons is that I was doing something else, and working late nights in a law firm or law school is not conducive to developing hobbies.

I also found community essential for maintaining your coding skills and interest. The most straightforward reason is that a community will help you when you encounter difficulties in your nascent journey as a coder. On the other hand, listening to and helping other people with their coding problems also improves your knowledge and skills.

The best thing about the internet is that you can find someone with similar interests as you — lawyers who code. On days when I feel exhausted with my day job, it’s good to know that someone out there is interested in the same things I am interested in, even if they live in a different world. So it would be best if you found your tribe; the only way to do that is to develop in the open.

  • Get your own GitHub account, write some code and publish it. Here's mine!
  • Interact on social media with people with the same interests as you. You’re more likely to learn what’s hot and exciting from them. I found Twitter to be the most lively place. Here's mine!
  • Join mailing lists, newsletters and meetups.

I find that it’s vital to be open since lawyers who code are rare, and you have to make a special effort to find them. They are like unicorns🦄!

Conclusion

So, do lawyers need to code? To me, you need a lot of drive to learn to code and build a career in law in the meantime. For those set on setting themselves apart this way, I hope the tips above can point the way. What other projects or opportunities do you see that can help lawyers who would like to code?

#Lawyers #Programming #LegalTech #blog #docassemble #Ideas #Law #OpenSource #RulesAsCode


Introduction

You must have worked hard to get here! We are almost at the end now.

Our journey took us from providing the user experience, to figuring out what should happen in the background, to interacting with an external service.

In this part, we ask docassemble to provide a file for the user to download.

Provisioning a File

When we left part 2, this was our result screen.

    event: final_screen
    question: |
      Download your sound file here.
    subquestion: |
      The audio on your text has been generated.
      
      You can preview it here too.
      
      <audio controls>
       <source type="audio/mpeg">
       Your browser does not support playing audio.
      </audio>
      
      Press `Back` above if you want to modify the settings and generate a new file,
      or click `Restart` below to begin a new request.
    buttons:
      - Exit: exit
      - Restart: restart

There are two places where you need an audio file.

  1. In the “question”, a link to the file is provided in “here” for download.
  2. The audio preview widget (the thing which you click to play) also needs a link to the file to function.

Luckily for us, docassemble provides a straightforward way to deal with files on the server. Simply stated, create a variable of type DAFile to hold a reference to the file, save the data to the file and then use it for those links in the results screen.

Let’s get started. Add this block to your main.yml file.

    ---
    objects:
      - generated: DAFile
    ---

This block creates an object called “generated”, which is a DAFile. Now your interview code can use “generated”.

Add the new line in the mandatory block we created in Part 2.

    mandatory: True
    code: |
      # The next line is new
      generated.initialize(filename="output.mp3")
      tts_task
      if tts_task.ready():
        final_screen
      else:
        waiting_screen

This code initialises “generated” by getting the docassemble server to provision it. If you use “generated” before initialising it, docassemble raises an error. 👻 (You will only get this error if you use the DAFile to create a file)

Now your background action needs access to “generated”. Pass it in a keyword parameter in the background action you created in Part 3.

    code: |
      tts_task = background_action(
        'bg_task',
        text_to_synthesize=text_to_synthesize,
        voice=voice,
        speaking_rate=speaking_rate,
        pitch=pitch,
        # This is the new keyword parameter
        file=generated
      )

Now that your background action has the file, use it to save the audio content. Add the new lines below to bg_task that you also created in Part 3.

    event: bg_task
    code: |
      audio = get_text_to_speech(
        action_argument('text_to_synthesize'),
        action_argument('voice'),
        action_argument('speaking_rate'),
        action_argument('pitch'),
      )
      # The next three lines are new
      file_output = action_argument('file')
      file_output.write(audio, binary=True)
      file_output.commit()
      background_response()

We assign the file to a new variable in the background task and then use it to write the audio (make sure it is in binary format as MP3s are not text). After that, commit the file to save it in the server or your external storage, depending on your configuration. (The above methods are from DAFile. You can read more details about what they do and other methods here.)
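If the binary flag feels abstract, the sketch below shows the same distinction in plain Python, outside docassemble. It does not use DAFile at all; the bytes are a made-up stand-in for the audio that comes back from the text-to-speech service, and the file names are only for illustration.

```python
import os
import tempfile

# Stand-in for the bytes returned by a text-to-speech service (not real MP3 data).
audio = b"\xff\xfb\x90\x00 not-really-an-mp3"

folder = tempfile.mkdtemp()
binary_path = os.path.join(folder, "output.mp3")
text_path = os.path.join(folder, "output-as-text.mp3")

# "wb" writes the bytes exactly as they are.
with open(binary_path, "wb") as f:
    f.write(audio)

with open(binary_path, "rb") as f:
    saved = f.read()

# Text mode expects str, so handing it bytes raises TypeError.
try:
    with open(text_path, "w") as f:
        f.write(audio)
    text_mode_failed = False
except TypeError:
    text_mode_failed = True
```

The bytes read back from the binary file match what was written, while the text-mode attempt fails before anything useful is saved.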

Now that the file is ready, we can plunk it into our results screen. We provide URLs here so that your user can download the file from the browser. Paths would not work because they refer to the server's file system, which the user's browser cannot reach. Modify the lines in the results screen block.

    event: final_screen
    # The link in the question below is new
    question: |
      Download your sound file **[here](${generated.url_for(attachment=True)}).**
    # The src attribute of the <source> tag below is new
    subquestion: |
      The audio on your text has been generated.
      
      You can preview it here too.
      
      <audio controls>
       <source src="${generated.url_for()}" type="audio/mpeg">
       Your browser does not support playing audio.
      </audio>
      
      Press `Back` above if you want to modify the settings and generate a new file,
      or click `Restart` below to begin a new request.
    buttons:
      - Exit: exit
      - Restart: restart

To get the URL for a DAFile, use the url_for method. This gives you an address you can use for downloading or for the web browser to play the audio.

Conclusion

Congratulations! You are now ready to run the interview. Give it a go and see if you can download the audio of a text you would like spoken. (If you are still at the Playground, you can click “Save and Run” to ensure your work is safe and test it around a bit.)

This Text to Speech docassemble interview is relatively straightforward to me. Nevertheless, its simplicity also showcases several functions which you may want to be familiar with. Hopefully, you now have an idea of dealing with external services. If you manage to hook up something interesting, please share it with the community!

Bonus: Trapping errors and alerting the users

The code so far is enough to provide users with hours of fun (hopefully not at your expense). However, there are edge cases which you should consider if you plan to make your interview more widely available.

Firstly, while it's pretty clear in this tutorial that you should have updated your Configuration so that this interview can find your service account, this doesn't always happen for others. Admins might have overlooked it.

Add this code as the first mandatory code block of main.yml (before the one we wrote in Part 3):

    mandatory: True
    code: |
      if get_config('google') is None or 'tts service account' not in get_config('google'):   
        if get_config('error notification email') is not None:
          send_email(to=get_config('error notification email'), 
            subject='docassemble-Google TTS raised an error', 
            body='You need to set service account credentials in your google configuration.' )
        else:
          log('docassemble-Google TTS raised an error -- You need to set service account credentials in your google configuration.')
          
        message('Error: No service account for Google TTS', 'Please contact your administrator.')

Take note that if you add more than one mandatory block, they are called in the order of their appearance in the interview file. So if you put this after the mandatory code block defining our processes, the process gets called before checking whether we should run this code in the first place.

This code block does a few things. First, it checks whether there is a “google” directive and, within it, a “tts service account” directive. If it doesn't find any tts service account information, it checks whether the admin has set an error notification email in the Configuration. If so, the server emails that address to report the issue. If not, it writes the error to docassemble.log, one of the logs on the server. Either way, the user sees an error message. (If the admin doesn't check his email or logs, I am unsure how we can help the admin.)
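The condition at the top of that block can be tried out in plain Python. This is only a sketch: the helper name `missing_tts_credentials` is made up, and it assumes that get_config('google') returns either None or a dictionary of the sub-directives.

```python
from typing import Optional


def missing_tts_credentials(google_config: Optional[dict]) -> bool:
    """Return True when no 'tts service account' entry is available."""
    # Mirrors the condition in the mandatory block: either there is no
    # "google" directive at all, or it lacks the sub-directive.
    return google_config is None or "tts service account" not in google_config


# Hypothetical configurations, written as the values get_config('google')
# is assumed to return.
no_google = None
google_without_tts = {"api key": "..."}
google_with_tts = {"tts service account": "{...}"}
```

Note that the check only looks for the key. As the next paragraph points out, it says nothing about whether the value stored there is usable.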

This mandatory check before starting the interview is helpful to catch the most obvious error – no configuration. However, you can pass this check by putting nonsense in the “tts service account”. Google is not going to process this. There may be other errors, such as Google being offline.

Narrowing down every possible error will be very challenging. Instead, we will make one crucial check: that the code did save a file at the end of the process. Even if we aren't going to be able to tell the user what went wrong, at least we spare the user the confusion of finding out that there was no file to download.

First, let's write the code that makes the check. Add this new code block.

    event: file_check
    code: |
      path = generated.path()
      if not os.path.exists(path):
        if get_config('error notification email') is not None:
          send_email(to=get_config('error notification email'), 
            subject='docassemble-Google TTS raised an error', 
            body='No file was saved in this interview.' )
        else:
          log('docassemble-Google TTS raised an error -- No audio file was saved in this interview.')
        message('Error: No audio file was saved', 'We are not sure why. Please try again. If the problem persists, contact your administrator.')

This code checks whether the audio file (generated, a DAFile) is an actual file or an apparition. If it doesn't exist, the admin receives a message. The user is also alerted to the failure.
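The same kind of check can be tested on its own in plain Python. In the sketch below, the helper name `audio_file_is_usable` is made up, and the size test is an extra assumption that the tutorial's code does not make: it also treats a file that was created but never written to as a failure.

```python
import os
import tempfile


def audio_file_is_usable(path: str) -> bool:
    # The tutorial's check is os.path.exists(path); the size test is an
    # extra assumption of this sketch, to catch a file that was created
    # but never written to.
    return os.path.exists(path) and os.path.getsize(path) > 0


folder = tempfile.mkdtemp()
missing = os.path.join(folder, "missing.mp3")
empty = os.path.join(folder, "empty.mp3")
written = os.path.join(folder, "output.mp3")

# Create an empty file and a file with some stand-in content.
open(empty, "wb").close()
with open(written, "wb") as f:
    f.write(b"audio bytes")
```

Only the third path passes; a missing file and an empty file are both reported as unusable.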

We would need to add a need directive to our results screen so that the check is made before the final screen to download the file is shown.

    event: final_screen 
    need:  # Add this line
      - file_check  # Add this line
    question: |
      Download your sound file **[here](${generated.url_for(attachment=True)}).**
    subquestion: |
      The audio on your text has been generated.
      
      You can preview it here too.
      
      <audio controls>
       <source src="${generated.url_for()}" type="audio/mpeg">
       Your browser does not support playing audio.
      </audio>
      
      Press `Back` above if you want to modify the settings and generate a new file,
      or click `Restart` below to begin a new request.
    buttons:
      - Exit: exit
      - Restart: restart

We would also need to import os.path from Python's standard library to make the check on our system. Add this new block near the top of our main.yml file.

    imports:
      - os.path

There you have it! The interview checks before you start whether there's a service account. It also checks before showing you the final screen whether your request succeeded and if an audio file is ready to download.

👈🏻 Go to the previous part.

☝🏻Return to the overview of this tutorial.

#tutorial #Python #Programming #docassemble #Google #TTS #LegalTech


Introduction

So far, all our work is on our docassemble install, which has been quite a breeze. Now we come to the most critical part of this tutorial: working with an external service, Google Text to Speech. Different considerations come into play when working with others.

In this part, we will install a client library from Google. We will then configure the setup to interact with Google’s servers and write this code in a separate module, google_tts.py. At the end of this tutorial, your background action will be able to call the function get_text_to_speech and get the audio file from Google.

1. A quick word about APIs

The term “API” can be used loosely nowadays. Some people use it to describe how you can use a program or a software library. In this tutorial, an API refers to a connection between computer programs. Instead of a website or a desktop program, we’re using Python and docassemble to interact with Google Text to Speech. In some cases, like Google Text to Speech, an API is the only way to work with the program.

The two most common ways to work with an API over the internet are (1) using a client library or (2) RESTful APIs. There are pros and cons to each option. In this tutorial we are going to go with a client library that Google provides. This allows us to work with the API in a programming language we are familiar with, Python. RESTful APIs can have more support and features than a client library (especially if the programming language is not popular). Still, you’d need to know your way around the requests and similar packages if you want to use them in Python.
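For comparison, here is a rough sketch of what the RESTful route involves: you build the JSON request yourself and decode the audio from the response. It makes no network call. The endpoint and field names reflect my understanding of Google's REST API at the time of writing, so treat them as assumptions and check Google's documentation. The helper names, the voice name and the fake response are made up for illustration.

```python
import base64
import json

# Assumed endpoint of the REST API at the time of writing; check Google's documentation.
ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request_body(text, voice, speaking_rate, pitch):
    # Field names follow the REST API's JSON (camelCase) conventions.
    return {
        "input": {"text": text},
        "voice": {"languageCode": "en-US", "name": voice},
        "audioConfig": {
            "audioEncoding": "MP3",
            "speakingRate": speaking_rate,
            "pitch": pitch,
        },
    }


def decode_audio(response_json):
    # The REST response carries the audio as a base64 string.
    return base64.b64decode(response_json["audioContent"])


body = build_request_body("Hello", "en-US-Wavenet-D", 1.0, 0.0)
payload = json.dumps(body)  # this string would be POSTed to ENDPOINT

# A made-up response, only to show the decoding step.
fake_response = {"audioContent": base64.b64encode(b"mp3-bytes").decode("ascii")}
audio = decode_audio(fake_response)
```

With the client library, the request building, authentication and decoding are handled for you, which is why this tutorial takes that route.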

2. Install the Client Library in your docassemble

Before we can start using the client library, we need to ensure that it’s there in our docassemble install. Programming in Python can be very challenging because of issues like this:

source: https://imgs.xkcd.com/comics/python_environment.png

Luckily, you will not face this problem if you’re using docker for your docassemble install (which most people do). Do this instead:

  1. Leave the Playground and go to another page called “Package Management”. (If you don’t see this page, you need to be either an admin or a developer)
  2. Under Install or update a package, specify google-cloud-texttospeech as the package to find on PyPI
  3. Click Update, and wait for the screen to show that the install is OK. (This takes time as there are quite a few dependencies to install)
  4. Verify that the google-cloud-texttospeech package has been installed by checking out the list of packages installed.

3. Set up a Text To Speech service account in docassemble

At this point, you should have obtained your Google Cloud Platform service account so that you can access the Text to Speech API. If you haven’t done so, please follow the instructions here. Take note that we will need your key information in JSON format. You don’t need to “Set your authentication environment variable” for this tutorial.

If you have not realised it yet, the key information in JSON format is a secret. While Google’s Text to Speech platform has a generous free tier, the service is not free. So, expect to pay Google if somebody with access to your account tries to read The Lord of the Rings trilogy. In line with best practices, secrets should be kept in a private and secure place, which is not your code repository. Please don’t include your service account details in your playground files!

Luckily, you can store this information in docassemble’s Configuration, which someone can’t access without an admin role and is not generally publicly available. Let’s do that by creating a directive google with a sub-directive of tts service account. Go to your configuration page and add these directives. Then fill out the information in JSON format you received from Google when you set up the service account.

In this example, the lines you will add to the Configuration should look like lines 118 to 131.

4. Putting it all together in the google_tts.py module

Now that our environment is set up, it’s time to create our get_text_to_speech function.

Head back to the Playground. Look for the dropdown titled “Folders”, click it, then select “Modules”.

Look for the editor and rename the file as google_tts.py. This is where you will enter the code to interact with Google Text to Speech. If you recall in part 3, we had left out a function named get_text_to_speech. We were also supposed to feed it with the answers we collected from the interviews we wrote in part 2. Let’s enter the signature of the function now.

    def get_text_to_speech(text_to_synthesize, voice, speaking_rate, pitch):
        # Enter more code here
        return

Since our task is to convert text to speech, we can follow the code in the example provided by Google.

A. Create the Google Text-to-Speech client

Using the Python client library, we can create a client to interact with Google’s service.

We need credentials to use the client to access the service. This is the secret you set up in step 3 above. It’s in docassemble’s configuration, under the directive google with a sub-directive of tts service account. Use docassemble’s get_config to look into your configuration and get the secret tts service account as a JSON.

With the secret to the service account, you can pass it to the class factory function and let it do the work.

    def get_text_to_speech(text_to_synthesize, voice, speaking_rate, pitch):
        from google.cloud import texttospeech
        import json
        from docassemble.base.util import get_config
    
        credential_info = json.loads(get_config('google').get('tts service account'), strict=False)
    
        client = texttospeech.TextToSpeechClient.from_service_account_info(credential_info)

Now that the client is ready with your service account details, let's get some audio.
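
If you are wondering why json.loads is called with strict=False, here is a minimal sketch. It assumes the service account JSON was pasted into the Configuration as a block of text, so a value such as private_key may contain raw line breaks. The dictionary below is a hypothetical stand-in for what get_config('google') returns, and its contents are made up.

```python
import json

# Hypothetical stand-in for get_config('google'); the values are made up.
# The "\n" below becomes a real line break inside the JSON string value.
fake_google_config = {
    "tts service account": '{"type": "service_account", "private_key": "line one\nline two"}'
}

raw = fake_google_config.get("tts service account")

# The default strict parser rejects control characters inside strings.
try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# With strict=False the same text parses into a dictionary.
credential_info = json.loads(raw, strict=False)

print("Parsed in strict mode:", strict_ok)
print("Account type:", credential_info["type"])
```

The resulting dictionary is the kind of object that from_service_account_info expects.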

B. Specify some options and submit the request

The primary function to request Google to turn text into speech is synthesize_speech. The function needs a bunch of stuff — the text to convert, a set of voice options, and options for your audio file. Let’s create some with the answers to the questions in part 2. Add these lines of code to your function.

The text to synthesise:

    input_text = texttospeech.SynthesisInput(text=text_to_synthesize)

The voice options:

    voice = texttospeech.VoiceSelectionParams(
            language_code="en-US",
            name=voice,
        )

The audio options:

    audio_config = texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3,
            speaking_rate=speaking_rate,
            pitch=pitch,
        )

Note that we did not allow all the options to be customised by the user. You can go through the documentation yourself to figure out what options you need or don’t need to worry the user. If you think the user should have more options, you’re free to write your questions and modify the code.

Finally, submit the request and return the audio.

    response = client.synthesize_speech(
            request={"input": input_text, "voice": voice, "audio_config": audio_config}
        )
    
    return response.audio_content

Voila! The client library can now call Google using your credentials and return your personalised result.
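
One thing the function above does not do is check the numbers before sending them to Google. Here is a minimal sketch of a guard you could add. It assumes the ranges given in Google's documentation for AudioConfig at the time of writing (0.25 to 4.0 for speaking_rate, -20.0 to 20.0 for pitch), so please check them against the current documentation. The helper name clamp_audio_options is my own invention for illustration.

```python
def clamp_audio_options(speaking_rate, pitch):
    """Return (speaking_rate, pitch) limited to the assumed API ranges."""
    # Assumed limits: speaking_rate 0.25 to 4.0, pitch -20.0 to 20.0 semitones.
    rate = min(max(float(speaking_rate), 0.25), 4.0)
    tone = min(max(float(pitch), -20.0), 20.0)
    return rate, tone


# Values outside the range are pulled back to the nearest limit.
safe_rate, safe_pitch = clamp_audio_options(10, -50)
print(safe_rate, safe_pitch)
```

You could call it at the top of get_text_to_speech and pass the results to AudioConfig, so a typo in the interview answers does not turn into a failed request.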

5. Let’s go back to our interview

Now that you have written your function, it’s time to let our interview know where to find it.

Go back to the playground, and add this new block in your main.yml file.

    ---
    modules:
      - .google_tts
    ---

This block tells the interview that some of our functions (specifically, the get_text_to_speech function) are found in the google_tts module.

Conclusion

At the end of this part, you have written your google_tts.py module and included it in your main.yml. You should also know how to install your python package to docassemble and edit your configuration file.

Well, that leaves us with only one more thing to do. We’ve got our audio content; now we just need to get it to the user. How do we do that? What’s that? DAFile? Find out in the next part.

👉🏻 Go to the final part.

👈🏻 Go back to the previous part.

☝🏻 Check out the overview of this tutorial.

#tutorial #docassemble #LegalTech #Google #TTS #Programming #Python

Author Portrait Love.Law.Robots. – A blog by Ang Hou Fu


Introduction

In January 2022, the 2020 Revised Edition of over 500 Acts of Parliament (the primary legislation in Singapore) was released. It’s a herculean effort to update so many laws in one go. A significant part of that effort is to “ensure Singapore’s laws are understandable and accessible to the public” and came out of an initiative named Plain Laws Understandable by Singaporeans (or Plus).

Keeping Singapore laws accessible to all – AGC, together with the Law Revision Committee, has completed a universal revision of Singapore’s Acts of Parliament! pic.twitter.com/76TnrNCMUq

— Attorney-General's Chambers Singapore (@agcsingapore) December 21, 2021

After reviewing the list of changes they made, such as replacing “notwithstanding” with “despite”, I frankly felt underwhelmed by the changes. An earlier draft of this article was titled “PLUS is LAME”. The revolution is not forthcoming.

I was bemused by my strong reaction to a harmless effort with noble intentions. It led me to wonder how to evaluate a claim, such as whether and how much changing a bunch of words would lead to a more readable statute. Did PLUS achieve its goals of creating plain laws that Singaporeans understand?

In this article, you will be introduced to well-known readability statistics such as Flesch Reading Ease and apply them to laws in Singapore. If you like to code, you will also be treated to some Streamlit, Altair-viz and Python Kung Fu, and all the code involved can be found in my Github Repository.

GitHub – houfu/plus-explorer: A streamlit app to explore changes made by PLUS. The code used in this project is accessible in this public repository.

How would we evaluate the readability of legislation?

Photo by Jamie Street / Unsplash

When we say a piece of legislation is “readable”, we are saying that a certain class of people will be able to understand it when they read it. It also means that a person encountering the text will be able to read it with little pain. Thus, “Plain Laws Understandable by Singaporeans” suggests that most Singaporeans, not just lawyers, should be able to understand our laws.

In this light, I am not aware of any tool in Singapore or elsewhere which evaluates or computes how “understandable” or readable laws are. Most people, especially in the common law world, seem to believe in their gut that laws are hard and out of reach for most people except for lawyers.

In the meantime, we would have to rely on readability formulas such as Flesch Reading Ease to evaluate the text. These formulas rely on semantic and syntactic features to calculate a score or index, which shows how readable a text is. Some of these formulas, like Gunning FOG and Dale-Chall, map their scores to US grade levels. Very approximately, these translate to years of formal education. A US Grade 10 student would, for example, be equivalent to a Secondary Four student in Singapore.
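
To make this concrete, here is a minimal sketch of the Flesch Reading Ease formula. It assumes a very naive syllable counter (groups of vowels) and a simple split on full stops, so treat it as an illustration. It is not necessarily how the scores in this article were computed.

```python
import re


def count_syllables(word):
    """Very rough syllable estimate: count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Published formula: higher scores mean easier text."""
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def score_text(text):
    """Score a non-empty piece of text using the naive counters above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(word) for word in words)
    return flesch_reading_ease(len(words), len(sentences), syllables)


plain = score_text("The cat sat on the mat.")
legalese = score_text(
    "Notwithstanding the aforementioned provisions, the authority may determine otherwise."
)
print(plain > legalese)
```

Only two things move the score: how long the sentences are and how long the words are. Keep that in mind for the rest of this article.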

After months of mulling about, I decided to write a pair of blog posts about readability: one that's more facts oriented: (https://t.co/xbgoDFKXXt) and one that's more personal observations (https://t.co/U4ENJO5pMs)

— brycew (@wowitisbryce) February 21, 2022

I found these articles to be a good summary and valuable evaluation of how readability scores work.

These formulas were created a long time ago and for different fields. For example, Flesch Reading Ease dates from 1948, and the related Flesch-Kincaid Grade Level was developed under contract to the US Navy in 1975 for educational purposes. In particular, using a readability statistic like FRE, you can tell whether a book is suitable for your kid.

I first considered using these formulas when writing interview questions for docassemble. Sometimes, some feedback can help me avoid writing rubbish when working for too long in the playground. An interview question is entirely different from a piece of legislation, but hopefully, the scores will still act as a good proxy for readability.

Selecting the Sample

Browsing vinyl music at a fair. Photo by Artificial Photography / Unsplash

To evaluate the claim, two pieces of information regarding any particular section of legislation are needed – the section before the 2020 Edition and the section in the 2020 Edition. This would allow me to compare them and compute differences in scores when various formulas are applied.

I reckon it’s possible to scrape the entire website of statutes online, create a list of sections, select a random sample and then delve into their legislative history to pick out the sections I need to compare. However, since there is no API to access statutes in Singapore, it would be a humongous and risky task to parse HTML programmatically and hope it is created consistently throughout the website.

Mining PDFs to obtain better text from Decisions: After several attempts at wrangling with PDFs, I managed to extract more text information from complicated documents using PDFMiner. (Love.Law.Robots.)

In one of my favourite programming posts, I extracted information from PDFs, even though the PDPC used at least three different formats to publish their decisions. Isn’t Microsoft Word fantastic?

I decided on an alternative method which I shall now present with more majesty:

The author visited the subject website and surveyed various acts of Parliament. When a particular act is chosen by the author through his natural curiosity, he evaluates the list of sections presented for novelty, variety and fortuity. Upon recognising his desired section, the author collects the 2020 Edition of the section and compares it with the last version immediately preceding the 2020 Edition. All this is performed using a series of mouse clicks, track wheel scrolling, control-Cs and control-Vs, as well as visual evaluation and checking on a computer screen by the author. When the author grew tired, he called it a day.

I collected over 150 sections as a sample and calculated and compared the readability scores and some linguistic features for them. I organised them using a pandas data frame and saved them to a CSV file so you can download them yourself if you want to play with them too.

data.csv.gz (76 KB): Gzipped CSV file containing the source data of 152 sections, their content in the 2020 Rev Edn, etc.
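
If you want to play with the file without pandas, here is a minimal sketch using only the standard library. The rows are made up and the column names are assumptions, so check them against the header of the real file. The sketch builds a tiny gzipped CSV in memory so it runs without the download.

```python
import csv
import gzip
import io

# Made-up rows standing in for data.csv.gz; the column names are assumptions.
rows = [
    {
        "section": "Example Act 2000 Section 1",
        "previous_flesch_reading_ease": "30.0",
        "current_flesch_reading_ease": "32.5",
    },
    {
        "section": "Example Act 2000 Section 2",
        "previous_flesch_reading_ease": "12.0",
        "current_flesch_reading_ease": "12.0",
    },
]

# Write the rows to an in-memory gzipped CSV so the example needs no download.
buffer = io.BytesIO()
with gzip.open(buffer, "wt", newline="", encoding="utf-8") as handle:
    writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)

# Read it back the same way you would read the downloaded file,
# and compute the change in score for each section.
buffer.seek(0)
with gzip.open(buffer, "rt", newline="", encoding="utf-8") as handle:
    changes = {
        row["section"]: float(row["current_flesch_reading_ease"])
        - float(row["previous_flesch_reading_ease"])
        for row in csv.DictReader(handle)
    }

for section, change in changes.items():
    print(section, change)
```

For the real file, replace the in-memory buffer with the path to your downloaded data.csv.gz.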

Exploring the Data with Streamlit

You can explore the data associated with each section yourself using my PLUS Explorer! If you don’t know which section to start with, you can always click the Random button a few times to survey the different changes made and how they affect the readability scores.

Screenshot of PLUS Section Explorer: https://share.streamlit.io/houfu/plus-explorer/main/explorer.py

You can use my graph explorer to get a macro view of the data. For the readability scores, you will find two graphs:

  1. A graph that shows the distribution of the value changes amongst the sample
  2. A graph that shows an ordered list of the readability scores (from most readable to least readable) and the change in score (if any) that the section underwent in the 2020 Edition.

You can even click on a data point to go directly to its page on the section explorer.

Screenshot of PLUS graph explorer: https://share.streamlit.io/houfu/plus-explorer/main/graphs.py

This project allowed me to revisit Streamlit, and I am proud to report that it’s still easy and fun to use. I still like it more than Jupyter Notebooks. I tried using ipywidgets to create the form to input data for this project, but I found it downright ugly and not user-friendly. If my organisation forced me to use Jupyter, I might reconsider it, but I wouldn’t be using it for myself.

Streamlit — works out of the box and is pretty too. Here are some features that were new to me since I last used Streamlit probably a year ago:

Pretty Metric Display

Metric display from Streamlit

My dear friends, this is why Streamlit is awesome. You might not be able to create a complicated web app or a game using Streamlit. However, Streamlit’s creators know what is essential or useful for a data scientist and provide it with a simple function.

The code to make the wall of stats (including their changes) is pretty straightforward:

    st.subheader('Readability Statistics')
    # Create three columns
    flesch, fog, ari = st.columns(3)
    
    # Create each column
    flesch.metric("Flesch Reading Ease",
                  dataset["current_flesch_reading_ease"][section_explorer_select],
                  dataset["current_flesch_reading_ease"][section_explorer_select] -
                  dataset["previous_flesch_reading_ease"][section_explorer_select])
    
    # For Fog and ARI, the lower the better, so delta colour is inverse
    fog.metric("Fog Scale",
               dataset["current_gunning_fog"][section_explorer_select],
               dataset["current_gunning_fog"][section_explorer_select] -
               dataset["previous_gunning_fog"][section_explorer_select],
               delta_color="inverse")
    
    ari.metric("Automated Readability Index",
               dataset["current_ari"][section_explorer_select],
               dataset["current_ari"][section_explorer_select] -
               dataset["previous_ari"][section_explorer_select],
               delta_color="inverse")

Don’t lawyers deserve their own tools?

Now Accepting Arguments

Streamlit apps are very interactive (I came close to creating a board game using Streamlit). Streamlit used to suffer from a significant limitation — except for the consumption of external data, you can’t interact with it from outside the app.

It’s at an experimental state now, but you can access arguments in its address just like an HTML encoded form. Streamlit has also made this simple, so you don’t have to bother too much about encoding your HTML correctly.

I used it to communicate between the graphs and the section explorer. Each section has its address, and the section explorer gets the name of the act from the arguments to direct the visitor to the right section.

    # Get and parse HTTP request
    query_params = st.experimental_get_query_params()
    
    # If the keyword is in the address, use it!
    if "section" in query_params:
        section_explorer_select = query_params.get("section")[0]
    else:
        section_explorer_select = 'Civil Law Act 1909 Section 6'

You can also set the address within the Streamlit app to reduce the complexity of your app.

    # Once this callback is triggered, update the address
    def on_select():
        st.experimental_set_query_params(section=st.session_state.selectbox)
    
    # Select box to choose section as an alternative.
    # Note that the key keyword names the entry in st.session_state
    # where the selected value is stored.
    st.selectbox("Select a Section to explore", dataset.index,
                 on_change=on_select, key='selectbox')

So all you need is a properly formed address for the page, and you can link it using a URL on any webpage. Sweet!
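
Here is a minimal sketch of what a "properly formed address" looks like, using only Python's standard library. It assumes the section explorer address shown earlier and the section query parameter used in the code above. The helper name section_url is made up for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Address of the section explorer, as given earlier in this article.
BASE_URL = "https://share.streamlit.io/houfu/plus-explorer/main/explorer.py"


def section_url(section_name):
    """Build a link that opens the explorer on the given section."""
    # urlencode takes care of spaces and other special characters.
    return f"{BASE_URL}?{urlencode({'section': section_name})}"


url = section_url("Civil Law Act 1909 Section 6")

# Parsing the query string gives a list of values for each key,
# which is why the Streamlit code above takes element [0].
params = parse_qs(urlparse(url).query)

print(url)
print(params["section"][0])
```

A query string can repeat the same key, so each value comes back as a list even when there is only one.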

Key Takeaways

Changes? Not so much.

From the list of changes, most of the revisions amount to swapping words for others. For word count, most sections experienced a slight increase or decrease of up to 5 words, and a significant number of sections had no change at all. The word count heatmap lays this out visually.

Unsurprisingly, this produced little to no effect on the readability of the section as computed by the formulas. For Flesch Reading Ease, a vast majority fell within a band of ten points of change, which is roughly a grade or a year of formal education. This is shown in the graph showing the distribution of changes. Many sections are centred around no change in the score, and most are bound within the band as delimited by the red horizontal rulers.

This was similar across all the readability formulas used in this survey (Automated Readability Index, Gunning FOG and Dale-Chall).

On the face of it, the 2020 Revised Edition of the laws had little to no effect on the readability of the legislation, as calculated by the readability formulas.

Laws remain out of reach to most people

I was also interested in the raw readability score of each section. This would show how readable a section is.

Since the readability formulas we are considering use years of formal schooling as a gauge, we can use the same measure to locate our target audience. If we use secondary school education as the minimum level of education (In 2020, this would cover over 75% of the resident population) or US Grade 10 for simplicity, we can see which sections fall in or out of this threshold.

Most if not all of the sections in my survey are out of reach for a US Grade 10 student or a person who attained secondary school education. This, I guess, proves the gut feeling of most lawyers that our laws are not readable to the general public in Singapore, and PLUS doesn’t change this.

Take readability scores with a pinch of salt

Suppose you are going to use the Automated Readability Index. In that case, you will need nearly 120 years of formal education to understand an interpretation section of the Point-to-Point Passenger Transport Industry Act.

Section 3 of the Point-to-Point Passenger Transport Industry Act makes for ridiculous reading.

We are probably stretching the limits of a tool made for processing prose in the late 60s. It turns out that many formulas average the number of words per sentence. This is based on the not so absurd notion that long sentences are hard to read. Unfortunately, many sections consist of a great many words in one interminable sentence. This skews the scores significantly and makes the mapping to particular audiences unreliable.
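
To see how much sentence length matters, here is a minimal sketch of the Automated Readability Index formula. The counts are invented for illustration (a 240-word provision with an average of five characters per word) and are not taken from any real section.

```python
def automated_readability_index(characters, words, sentences):
    """Published ARI formula: the result approximates a US grade level."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


# Invented counts: the same 240 words, first as one sentence, then as eight.
one_sentence = automated_readability_index(characters=1200, words=240, sentences=1)
eight_sentences = automated_readability_index(characters=1200, words=240, sentences=8)

print(round(one_sentence, 2))
print(round(eight_sentences, 2))
```

With these made-up numbers, the single sentence lands above grade 120, while the same words spread over eight sentences land around grade 17. The words are identical. Only the full stops changed.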

The fact that some scores don’t make sense when applied in the context of legislation doesn’t invalidate the point that legislation is hard to read. Whatever historical reasons legislation has for being expressed the way it is, it harms people who have to use it.

In my opinion, the scores are useful to tell whether a person with a secondary school education can understand a piece. This was after all, what the score was made for. However, I am more doubtful whether we can derive any meaning from a score of, for example, ARI 120 compared to a score of ARI 40.

Improving readability scores can be easy. Should it?

Singaporean students know that there is no point in studying hard; you have to study smart.

Having realised that the number of words per sentence features heavily in readability formulas, the easiest thing to do to improve a score is to break long sentences up into several sentences.

True enough, breaking up one long sentence into two seems to affect the score profoundly: see Section 32 of the Defence Science and Technology Agency Act 2000. The detailed mark changes section shows that when the final part of subsection three is broken off into subsection 4, the scores improved by nearly 1 grade.

It’s curious why more sections were not broken up this way in the 2020 Revised Edition.

However, breaking long sentences into several short ones doesn’t always improve reading. It’s important to note that such scores focus on linguistic features, not content or meaning. So in trying to game the score, you might be losing sight of what you are writing for in the first place.

Here’s another reason why readability scores should not be the ultimate goal. One of PLUS’s revisions is to remove gendered nouns — chairperson instead of chairman, his or her instead of his only. Trying to replace “his” with “his or her” harms readability by generally increasing the length of the sentence. See, for example, section 32 of the Weights and Measures Act 1975.

You can agree or disagree whether legislation should reflect our values such as a society that doesn't discriminate between genders. (It's interesting to note that in 2013, frequent legislation users were not enthusiastic about this change.) I wouldn't suggest though that readability scores should be prioritised over such goals.

Here’s another point which shouldn’t be missed. Readability scores focus on linguistic features. They don’t consider things like the layout or even graphs or pictures.

A striking example of this is the interpretation section found in legislation. They aren’t perfect, but most legislation users are okay with them. You would use the various indents to find the term you need.

Example of an interpretation section and the use of indents to assist reading.

However, the layout is ignored because white space, including indents, is not visible to the formula. The section appears to the computer as one long sentence, and its readability is computed accordingly (read: terrible). This was the provision that required 120 years of formal education to read.
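
Here is a minimal sketch of why the layout is invisible. It assumes a naive sentence splitter that only looks at full stops, question marks and exclamation marks, which is roughly what simple readability tools do. The definition below is made up.

```python
import re

# A made-up definition laid out with indents, like an interpretation section.
definition = """In this Act, "vehicle" means
    (a) a motor car;
    (b) a motor cycle; or
    (c) any other conveyance prescribed by the Minister."""


def count_sentences(text):
    """Naive splitter: a sentence ends at a full stop, ! or ?"""
    return len([s for s in re.split(r"[.!?]+", text) if s.strip()])


def words_per_sentence(text):
    return len(text.split()) / count_sentences(text)


# Collapse every line break and indent into single spaces.
flattened = " ".join(definition.split())

print(count_sentences(definition))
print(words_per_sentence(definition) == words_per_sentence(flattened))
```

The indented version and the flattened version give the same numbers, because the line breaks and indents never reach the formula.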

I am not satisfied that readability should be ignored in this context, though. Interpretation sections, despite the creative layout, remain very difficult to read. That’s because it is still text-heavy, and even when read alone, the definition is still a very long sentence.

A design that relies more on graphics and diagrams would probably use fewer words than this. Even though the scores might be meaningless in this context, they would still show up as an improvement.

Conclusion

PLUS might have a noble aim of making laws understandable to Singaporeans, but the survey of the clauses here shows that its effect is minimal. It would be great if drafters refer to readability scores in the future to get a good sense of whether the changes they are making will impact the text. Even if such scores have limitations, they still present a sound and objective proxy of the readability of the text.

I felt that the changes were too conservative this time. An opportunity to look back and revise old legislation will not return for a while (the last time such a project was undertaken was in 1985). Given the scarcity of opportunity, I am not convinced that we should (a) try to preserve historical nuances which very few people can appreciate, or (b) avoid superficial changes in meaning given the advances in statutory interpretation in the last few decades in Singapore.

Beyond using readability scores that focus heavily on text, it would be helpful to consider more legal design — I sincerely believe pictures and diagrams will help Singaporeans understand laws more than endlessly tweaking words and sentence structures.

This study also reveals that it might be helpful to have a readability score for legal documents. You will have to create a study group comprising people with varying education levels, test them on various texts or legislation, then create a machine model that predicts what level of difficulty a piece of legislation might be. A tool like that could probably use machine models that observe several linguistic features: see this, for example.

Finally, while this represents a lost opportunity for making laws more understandable to Singaporeans, the 2020 Revised Edition includes changes that improve the quality of life for frequent legislation users. This includes changing all the Acts of Parliament to have a year rather than the historic and quaint chapter numbers and removing information that is no longer relevant today, such as provisions relating to the commencement of the legislation. As a frequent legislation user, I did look forward to these changes.

It’s just that I wouldn’t be showing them off to my mother any time soon.

#Features #DataScience #Law #Benchmarking #Government #LegalTech #NaturalLanguageProcessing #Python #Programming #Streamlit #JupyterNotebook #Visualisation #Legislation #AGC #Readability #AccesstoJustice #Singapore

Author Portrait Love.Law.Robots. – A blog by Ang Hou Fu


I love playing with legal data. For me, books specialising in legal data are uncommon, especially those dealing with what’s available on the wild world of the internet today.

That’s why I snapped up Sarah Sutherland’s “Legal Data and Information in Practice”. Ms Sutherland was CEO of CanLII, one of the most admirable LIIs. CanLII is extensive, comprehensive, and packed with great features like noting up and keywords. It even comes in two languages.

Legal Data and Information in Practice: How Data and the Law Interact, by Sarah A. Sutherland (Routledge & CRC Press). Legal Data and Information in Practice provides readers with an understanding of how to facilitate the acquisition, management, and use of legal data in organizations such as libraries, courts, governments, universities, and start-ups. Presenting a synthesis of information about legal data that will…

The book’s blurb recommends that it is “essential reading for those in the law library community who are based in English-speaking countries with a common law tradition”.

Since finishing the book, I found the blurb’s focus way too narrow. This is a book for anyone who loves legal data.

For one, I enjoyed the approachable language. My interaction with legal data has always been pragmatic. Either I was studying for some course, or I needed to find an answer quickly. It will be enough to appreciate the book if you’ve done any of those things. I liked that it didn’t baffle me with impossible or theoretical language. I found myself nodding at several junctures as I reflected on my experience of interacting with legal data as well.

Furthermore, it’s effectively a primer:

  • It’s short. I took a month to finish it at a leisurely pace (i.e., in between taking care of children, making sure the legal department runs smoothly, and programming). Oh, and unlike most law books, it has pictures.
  • It effectively explains a broad range of topics. It talks about the challenges of AI and the political and administrative backgrounds of how legal data is provided without overwhelming you. More impressively, I found new areas in this field that I didn’t know about before reading the book, such as the various strategies to acquire legal data and an overview of statistical and machine learning techniques on data.

So, even if you are not a librarian or a legal technologist by profession, this book is still handy for you. I would love more depth, and maybe that’s some scope for a 2nd edition. In any case, Sarah Sutherland’s “Legal Data and Information in Practice” is a great starting point for everyone. Reading it will level up your ability to discuss and evaluate what’s going on in this exciting field.

* * *

I am sorry for being a sucker — I am the kind of guy who watches movies to swoon at sweeping visages of my home jurisdiction, Singapore. I enjoyed Crazy Rich Asians, even though it’s fake.

So, I couldn’t resist looking for references to Singapore in the book. Luckily for me, Singapore is mentioned several times. It’s described as “an interesting example of what can happen if a government is willing to invest heavily in developing capacity in legal computing and data use”. I’m not convinced that LawNet is like an LII, but the other points raised, such as the infrastructure, availability and formats, are still much better here than in the rest of the common law world.

The more interesting point is that Singapore, as a small jurisdiction, would usually find its dataset smaller. That’s why it is crucial to experiment with making models trained on other kinds of data work well on yours. (I think the paper cited in the book is an excellent example of this.) Other facets are relevant when you have less data and fewer resources: what kinds of legal data one should focus on, and the strategies to acquire them.

The challenges of a smaller dataset seem to be less exciting because fewer people are staring at them. However, I would suggest that these challenges are more prevalent than you would expect — companies and organisations also have smaller datasets and fewer resources. What would work for Singapore should be of interest to many others.

There’s always something to be excited about in this field. What do you think?

#BookReview #ArtificalIntelligence #DataMining #Law #LegalTech #MachineLearning #NaturalLanguageProcessing #Singapore #TechnologyLaw

Author Portrait Love.Law.Robots. – A blog by Ang Hou Fu


A new year brings new beginnings, except the light at the end of the tunnel shines harshly on what we have left behind.

In a normal year, the start of the year for the legal fraternity brings the opening of the Legal Year. A ceremony is also held where representatives of the bar, judiciary and Attorney-General’s Chambers give speeches with courtesy and camaraderie.

However, this year’s speech from the bar brings startling news. 538 lawyers left the profession in Singapore last year, a year-on-year increase of over 30 per cent.

Far more disturbing is the revelation that the departures are concentrated on young lawyers of less than 5 years of practice. For a long time, the bar has been concerned about the “hourglass” distribution of practice. Fat at the bottom and the top, where the youngest and oldest in the profession are, and thin in the middle. Is burnout starting earlier now?

It’s heartening that such problems are being confronted right now. In the said speech, Law Society President Adrian Tan puts it in this manner:

The 21st-century lawyers are different. They want to marry, not the law, but a human being. They, too, want to work hard. They, too, want their work to have meaning. But they also want other things that human beings want: to have children, to build a home, to have a life outside the law.

Even one of the new Senior Counsels from the bar (it’s interesting to note that both of them are female this year) put the concern in a similar manner.

I hope to be a role model of sorts to some to stay the course that much longer. I do hope that (this appointment) is a sign to all the young ladies out there that there is more that can be done.

So what would solve young lawyers burning out? Mr Adrian Tan posits that the 21st Century lawyer can have a sustainable career when the “law firm” as a physical place vanishes:

This is the picture I present to you of the New Singapore Lawyer, who works from a laptop, uses technology to collaborate with other lawyers, meets clients virtually, and is not bound to a physical office. Whenever there is a need for sensitive communication, the New Singapore Lawyer will book a secure Zoom pod. If there is a month-long arbitration with opponents in different time zones, the New Singapore Lawyer will use special facilities to cater to those needs... Put another way: the New Singapore Lawyer will spend more time on work, rather than on commuting to work.

This idea appears to have come out from the experience of senior lawyers working from home. It was a strange and foreign experience for everyone.

I like the vision of this “New Singapore Lawyer” (it’s great we finally have a published fiction author as a Law Society President).

However, I experienced many bouts of irony as I waded through its implications. One of the experiences people have from working from home is that without the separation of the workplace, they spend longer hours working. If we want young lawyers to not burn out, bringing work home does not look like a good start.

Another bout of irony came from the “threat” that technology can bring to the legal profession. Last year, the Singapore Academy of Law published a 600-page tome whose raison d'être was to examine how technology impacts the law for a profession that didn’t necessarily welcome it. While legal work done by city law offices is bespoke enough to not be replaced by robots, legal work at the lower end is more susceptible to being automated: people can turn to “googling” to find answers to legal questions rather than pay an hourly rate to a lawyer. The free-wheeling New Singapore Lawyer might not be so carefree after all.

So has working from home made us all love technology a lot more? Maybe, but I would suggest that this happy relationship is likely to be very limited and short-lived. It won’t be enough to overcome the challenges of burnout that young lawyers face at the beginning of their professional lives.

#Singapore #Law #Lawyers #LegalTech #LawSociety #WorkLifeBalance

Author Portrait Love.Law.Robots. – A blog by Ang Hou Fu

We are almost coming to the end of the year and it's that time when we count our chickens and our eggs. We still have a month to go before the end of 2021, so I will leave my plans for 2022 for another post.

Instead, I would like to record my thanks to the subscribers to whom I get the opportunity to regularly send emails. It's always nice to attach a name to the work you do, so you are doing lots to keep me going!

I am embarrassed for sure, but I have prepared a little survey for subscribers. It's your chance to give me feedback on how Love.Law.Robots is doing and to mould what it will be like in the future.

You can click on this image to access the survey. Thank you!

What I am reading now

Web Scraping with Python, 2e: Collecting More Data from the Modern Web, by Ryan Mitchell (Amazon.sg). I earn a commission from purchases made through this affiliate link.

  • GPT-3 is generally available now. The language model that made everyone fear for their livelihoods no longer has a waitlist. You can sign up and use the models for free for 3 months or 300,000 tokens (roughly words), whichever comes first. GPT-3 generates text, so it's probably best used for writing. Advanced uses (which I haven't tried) include generating contract clauses and submissions. I find the safety guidelines very interesting as well. Do check it out!

OpenAI API: An API for accessing new AI models developed by OpenAI

  • I attended the Jones Day Chair of Commercial Law at Singapore Management University lecture delivered by Professor Dan Katz last week. It was interesting to hear about LegalTech in Singapore and the slide deck is available for you if you missed it. Hopefully, it's the last time I am going to hear COVID being used as a generational marker.

“The Legal Innovation Agenda – Pre and Post COVID” — Last Night, I gave this Virtual Talk for Jones Day Chair of Commercial Law at Singapore Management University … (500 Slides in ~60 minutes) https://t.co/5XqXgo3Mza #LegalTech #LegalInnovation #LegalData

— Computational Legal (@computational) November 17, 2021

Postscript

My post on using docker-compose and traefik is quite charming now that I look back at it. The server is still dutifully running docassemble and others. However, I now have a new problem: I have three computers running docker: a legacy desktop, a NAS and a Raspberry Pi. I have always wondered whether I am making full use of them. It's only natural to expand my cloud computing knowledge. I believe that a docker swarm post is the natural follow-up.

Docker Swarm Rocks: Docker Swarm mode ideas and tools. Learn about Docker Swarm here!

That's it!

I hope you enjoyed this post. Please do take some time to fill in the Subscriber Survey. I will see you again soon!

Author Portrait Love.Law.Robots. – A blog by Ang Hou Fu


October is drawing to a close, and so the end of the year is almost upon us. It's hard to fathom that I have been stuck working from home for nearly 20 months now. Some countries seem to have moved on, but I doubt we'd do so in Singapore. Nevertheless, it's time for reflection and thinking about what to do about the future.

What I am reading now

The Importance of Being Authorised: A recent case shows that practising law as an unauthorised person can have serious effects. What does this hold for other people who may be interested in alternative legal services? (Love.Law.Robots., Houfu) An in-depth analysis of a rare and recent local decision touching on this point.

CLM Simplified: Efficient Contracting for Law Departments, by Lucy Endel Bassli (Amazon.sg). I earn a commission from purchases made with this link.

  • Do you need a lot of coding or technical skills to use AI? This commentator from Today Online highlights Hugging Face, Gradio and Streamlit and doesn't think so. So have we finally resolved the question of whether lawyers need to code? I still think the answer is very nuanced: one person can compile a graph using free tools quickly, but making it production-ready is tough and won't be free. I agree more with the premise that we need to better empower students and others to “seek out AI services and solutions on their own”. In the legal field, this starts with having more data out there available for all to use.

Why you don’t need to be an expert to use AI any more: Keeping up with the latest developments in artificial intelligence is like drinking from the proverbial fire hose, as a recent 188-page overview by two tech investors Ian Hogarth and Nathan Benaich would attest. (TODAYonline)

Post Updates

This week saw the debut of my third feature: “It's Open. It's Free — Public Legal Information in Singapore”. I have been working on it for several months, and it's still a work in progress. I made it as part of my research into what materials to scrape, and I've hinted at the project several times recently. In due course, I want to add more obscure courts and tribunals, including the PDPC and others. You can check the page regularly, or I will mention it here from time to time. I welcome your comments and suggestions on what I should cover.

That's it!

Family Playing A Board Game. An Asian family (adult male and female and two adolescents, male and female) sitting around a coffee table playing a board game. Photographer: Bill Branson. Photo by National Cancer Institute / Unsplash

At the start of this newsletter, I mentioned that November is the month to be looking forward. 😋 Unfortunately, for the time being, I will be racing to finish articles that I have wanted to write since the pandemic started. This includes my observations from playing Monopoly Junior 5 million times. You can look at a sneak peek of the work in my Streamlit app (if it runs).

In the meantime, I will be weighing the pros and cons of using MongoDB or SQL for my scraping project. Storing text and downloads on S3 is pretty straightforward, but where should I store the metadata of the decisions? If anyone has an opinion, I could use some advice!
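To make the metadata question concrete, here is a minimal sketch of the SQL side of that choice, using Python's built-in sqlite3 module. It is not the design I have settled on: the table, the column names, the function names and the example S3 keys are all hypothetical, made up for illustration. The idea is simply that the files stay in S3 and the database only records where each one lives.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE IF NOT EXISTS decisions (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    court TEXT NOT NULL,
    decided_on TEXT NOT NULL,      -- ISO date string, so text sorting works
    s3_key TEXT NOT NULL UNIQUE    -- location of the text or PDF in the bucket
)
"""


def open_db(path=":memory:"):
    """Open (or create) the metadata database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_decision(conn, title, court, decided_on, s3_key):
    """Record one scraped decision; the file itself is assumed to be in S3."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO decisions (title, court, decided_on, s3_key) "
            "VALUES (?, ?, ?, ?)",
            (title, court, decided_on, s3_key),
        )


def keys_for_court(conn, court):
    """Return the S3 keys for one court or tribunal, oldest decision first."""
    rows = conn.execute(
        "SELECT s3_key FROM decisions WHERE court = ? ORDER BY decided_on",
        (court,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = open_db()
    add_decision(db, "Example Decision B", "PDPC", "2021-06-01", "pdpc/b.pdf")
    add_decision(db, "Example Decision A", "PDPC", "2020-03-15", "pdpc/a.pdf")
    print(keys_for_court(db, "PDPC"))
```

A document store like MongoDB would let each court carry its own odd fields without changing the schema, while a fixed table like this one makes filtering and sorting across courts cheap. That trade-off is really what I am weighing.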

Thanks for reading, and feel free to reach out!

#Newsletter #ArtificalIntelligence #BookReview #Contracts #DataMining #Law #DataScience #LegalTech #Programming #Singapore #Streamlit #WebScraping

Author Portrait Love.Law.Robots. – A blog by Ang Hou Fu